Product and Method 



This application is a 371 of PCT/GB2003/005102, 
filed November 21, 2003, the disclosure of which is 
incorporated herein by reference. 

A Sequence Listing on a single CD-ROM was filed 
with this application (file name: Q87920.ST25.txt). The 
Sequence Listing contains each of the polynucleotide and 
polypeptide sequences disclosed herein. The Sequence 
Listing is incorporated herein by reference. 

The present invention relates to oligonucleotide 
probes, for use in assessing gene transcript levels in a 
cell, which may be used in analytical techniques, 
particularly diagnostic techniques. Conveniently the 
probes are provided in kit form. Different sets of 
probes may be used in techniques to prepare gene 
expression patterns and identify, diagnose or monitor 
different states, such as diseases, conditions or stages 
thereof. Also provided are methods of identifying 
suitable probes and their use in methods of the 
invention. 

The identification of quick and easy methods of 
sample analysis for, for example, diagnostic 
applications, remains the goal of many researchers. End 
users seek methods which are cost effective, produce 
statistically significant results and which may be 
implemented routinely without the need for highly 
skilled individuals. 

The analysis of gene expression within cells has 
been used to provide information on the state of those 
cells and importantly the state of the individual from 
which the cells are derived. The relative expression of 
various genes in a cell has been identified as 
reflecting a particular state within a body. For 
example, cancer cells are known to exhibit altered 
expression of various proteins and the transcripts or 
the expressed proteins may therefore be used as markers 
of that disease state. 

Thus biopsy tissue may be analysed for the presence 
of these markers and cells originating from the site of 
the disease may be identified in other tissues or fluids 
of the body by the presence of the markers. 
Furthermore, products of the altered expression may be 
released into the blood stream and these products may be 
analysed. In addition cells which have contacted 
disease cells may be affected by their direct contact 
with those cells resulting in altered gene expression 
and their expression or products of expression may be 
similarly analysed. 

However, there are some limitations with these 
methods. For example, the use of specific tumour 
markers for identifying cancer suffers from a variety of 
defects, such as lack of specificity or sensitivity, 
association of the marker with disease states besides 
the specific type of cancer, and difficulty of detection 
in asymptomatic individuals. 

In addition to the analysis of one or two marker 
transcripts or proteins, more recently, gene expression 
patterns have been analysed. Most of the work involving 
large-scale gene expression analysis with implications 
in disease diagnosis has involved clinical samples 
originating from diseased tissues or cells. For 
example, several recent publications, which demonstrate 
that gene expression data can be used to distinguish 
between similar cancer types, have used clinical samples 
from diseased tissues or cells (Alon et al., 1999, PNAS, 
96, p6745-6750; Golub et al., 1999, Science, 286, p531-537; 
Alizadeh et al., 2000, Nature, 403, p503-511; 
Bittner et al., 2000, Nature, 406, p536-540). 

However, these methods have relied on analysis of a 
sample containing diseased cells or products of those 
cells or cells which have been contacted by disease 
cells. Analysis of such samples relies on knowledge of 
the presence of a disease and its location, which may be 
difficult in asymptomatic patients. Furthermore, 
samples cannot always be taken from the disease site, 
e.g. in diseases of the brain. 

In a finding of great significance, the present 
inventors identified the previously untapped potential 
of all cells within a body to provide information 
relating to the state of the organism from which the 
cells were derived. WO98/49342 describes the analysis 
of the gene expression of cells distant from the site of 
disease, e.g. peripheral blood collected distant from a 
cancer site. 

This finding is based on the premise that the 
different parts of an organism's body exist in dynamic 
interaction with each other. When a disease affects one 
part of the body, other parts of the body are also 
affected. The interaction results from a wide spectrum 
of biochemical signals that are released from the 
diseased area, affecting other areas in the body. 
Although the nature of the biochemical and 
physiological changes induced by the released signals 
can vary in the different body parts, the changes can be 
measured at the level of gene expression and used for 
diagnostic purposes. 

The physiological state of a cell in an organism is 
determined by the pattern with which genes are expressed 
in it. The pattern depends upon the internal and 
external biological stimuli to which said cell is 
exposed, and any change either in the extent or in the 
nature of these stimuli can lead to a change in the 
pattern with which the different genes are expressed in 
the cell. There is a growing understanding that by 
analysing the systemic changes in gene expression 
patterns in cells in biological samples, it is possible 
to provide information on the type and nature of the 
biological stimuli that are acting on them. Thus, for 
example, by monitoring the expression of a large number 
of genes in cells in a test sample, it is possible to 
determine whether their genes are expressed with a 
pattern characteristic for a particular disease, 
condition or stage thereof. Measuring changes in gene 
activities in cells, e.g. from tissue or body fluids, is 
therefore emerging as a powerful tool for disease 
diagnosis. 

Such methods have various advantages. Often, 
obtaining clinical samples from diseased areas of the 
body can be difficult and may involve undesirable 
invasion of the body; for example, biopsy is 
often used to obtain samples for cancer. In some cases, 
such as in Alzheimer's disease, the diseased brain 
specimen can only be obtained post-mortem. Furthermore, 
the tissue specimens which are obtained are often 
heterogeneous and may contain a mixture of both diseased 
and non-diseased cells, making the analysis of generated 
gene expression data both complex and difficult. 

It has been suggested that a pool of tumour tissues 
that appear to be pathogenetically homogeneous with 
respect to morphological appearances of the tumour may 
well be highly heterogeneous at the molecular level 
(Alizadeh, 2000, supra), and in fact might contain 
tumours representing essentially different diseases 
(Alizadeh, 2000, supra; Golub, 1999, supra). For the 
purpose of identifying a disease, condition, or a stage 
thereof, any method that does not require clinical 
samples to originate directly from diseased tissues or 
cells is highly desirable since clinical samples 
representing a homogeneous mixture of cell types can be 
obtained from an easily accessible region in the body. 

We have now identified a set of probes of 
surprising utility for identifying one or more diseases. 

Thus, we now describe probes and sets of probes derived 
from cells which are not disease cells and which have 
not contacted disease cells, which correspond to genes 
which exhibit altered expression in normal versus 
disease individuals, for use in methods of identifying, 
diagnosing or monitoring certain conditions, 
particularly diseases or stages thereof. 

Thus the invention provides a set of 
oligonucleotide probes which correspond to genes in a 
cell whose expression is affected in a pattern 
characteristic of a particular disease, condition or 
stage thereof, wherein said genes are systemically 
affected by said disease, condition or stage thereof. 
Preferably said genes are metabolic or house-keeping 
genes and preferably are constitutively moderately or 
highly expressed. Preferably the genes are moderately 
or highly expressed in the cells of the sample but not 
in cells from disease cells or in cells having contacted 
such disease cells. 

Such probes, particularly when isolated from cells 
distant to the site of disease, do not rely on the 
development of disease to clinically recognizable levels 
and allow detection of a disease or condition or stage 
thereof very early after the onset of said disease or 
condition, even years before other subjective or 
objective symptoms appear. 

As used herein "systemically" affected genes refers 
to genes whose expression is affected in the body 
without direct contact with a disease cell or disease 
site and the cells under investigation are not disease 
cells. 

"Contact" as referred to herein refers to cells 
coming into close proximity with one another such that 
the direct effect of one cell on the other may be 
observed, e.g. an immune response, wherein these 
responses are not mediated by secondary molecules 
released from the first cell over a large distance to 
affect the second cell. Preferably contact refers to 
physical contact, or contact that is as close as is 
sterically possible. Conveniently, cells which contact 
one another are found in the same unit volume, for 
example within 1 cm³. 

A "disease cell" is a cell manifesting phenotypic 
changes and is present at the disease site at some time 
during its life-span, e.g. a tumour cell at the tumour 
site or which has disseminated from the tumour, or a 
brain cell in the case of brain disorders such as 
Alzheimer's disease. 

"Metabolic" or "house-keeping" genes refer to those 
genes responsible for expressing products involved in 
cell division and maintenance, e.g. non-immune function 
related genes. 

"Moderately or highly" expressed genes refers to 
those present in resting cells in a copy number of more 
than 30-100 copies/cell (assuming an average of 3×10⁵ mRNA 
molecules in a cell). 
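
As a rough illustration of what this threshold means, the quoted copy numbers can be converted into a share of the cell's total mRNA pool. The following is a minimal sketch, assuming only the figure of 3×10⁵ mRNA molecules per cell stated above; the function name is hypothetical and chosen for illustration.

```python
# Assumption taken from the text: an average cell holds about 3 x 10^5 mRNA molecules.
MRNA_PER_CELL = 3e5


def transcript_fraction(copies_per_cell, total_mrna=MRNA_PER_CELL):
    """Return the fraction of the cell's mRNA pool made up by one transcript."""
    return copies_per_cell / total_mrna


low = transcript_fraction(30)    # lower end of the quoted threshold range
high = transcript_fraction(100)  # upper end of the quoted threshold range
print(f"{low:.4%} to {high:.4%} of total mRNA")
```

On these assumptions a "moderately or highly" expressed transcript accounts for roughly one molecule in ten thousand, or more, of the mRNA present in the cell.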

Specific probes having the above described 
properties are provided herein. 

Thus in one aspect, the present invention provides 
a set of oligonucleotide probes, wherein said set 
comprises at least 10 oligonucleotides selected from: 

an oligonucleotide as described in Table 1 or 
derived from a sequence described in Table 1, or an 
oligonucleotide with a complementary sequence, 
or a functionally equivalent oligonucleotide. 

"Table 1" as referred to herein refers to Table 1a 
and/or Table 1b. Table 1b contains reference to 
additional clones and sequences as disclosed herein. 
Similarly Tables 2 and 4 comprise 2 parts, a and b. 

The invention also provides one or more 
oligonucleotide probes, wherein each oligonucleotide 
probe is selected from the oligonucleotides listed in 
Table 1, or derived from a sequence described in Table 
1, or a complementary sequence thereof. The use of such 
probes in products and methods of the invention forms 
further aspects of the invention. As referred to herein 
an "oligonucleotide" is a nucleic acid molecule having 
at least 6 monomers in the polymeric structure, ie. 
nucleotides or modified forms thereof. The nucleic acid 
molecule may be DNA, RNA or PNA (peptide nucleic acid) 
or hybrids thereof or modified versions thereof, e.g. 
chemically modified forms, e.g. LNA (Locked Nucleic 
Acid), by methylation or made up of modified or 
non-natural bases during synthesis, providing they retain 
their ability to bind to complementary sequences. Such 
oligonucleotides are used in accordance with the 
invention to probe target sequences and are thus 
referred to herein also as oligonucleotide probes or 
simply as probes. 

An "oligonucleotide derived from a sequence 
described in Table 1" (or any other table) refers to a 
part of a sequence disclosed in that Table (e.g. Tables 
1-4), which satisfies the requirements of the 
oligonucleotide probes as described herein, e.g. in 
length and function. Preferably said parts have the size 
described hereinafter. 

Preferably the oligonucleotide probes forming said 
set are at least 15 bases in length to allow binding of 
target molecules. Especially preferably said 
oligonucleotide probes are from 20 to 200 bases in 
length, e.g. from 30 to 150 bases, preferably 50-100 
bases in length. 

As referred to herein the term "complementary 
sequences" refers to sequences with consecutive 
complementary bases (ie. T:A, G:C) and which 
complementary sequences are therefore able to bind to 
one another through their complementarity. 

Reference to "10 oligonucleotides" refers to 10 
different oligonucleotides. Whilst a Table 1 
oligonucleotide, a Table 1 derived oligonucleotide and 
their functional equivalent are considered different 
oligonucleotides, complementary oligonucleotides are not 
considered different. Preferably however, the at least 
10 oligonucleotides are 10 different Table 1 
oligonucleotides (or Table 1 derived oligonucleotides or 
their functional equivalents). Thus said 10 different 
oligonucleotides are preferably able to bind to 10 
different transcripts. 
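
The counting rule above, under which a probe and its complementary sequence count as a single oligonucleotide, can be sketched as follows. This is a minimal illustration and not a prescribed procedure: it assumes DNA probes, assumes that "complementary" means the reverse complement (antiparallel strands), and uses hypothetical toy sequences far shorter than real probes. The function names are likewise hypothetical.

```python
# Watson-Crick pairs for DNA (T:A, G:C), as in the definition of
# complementary sequences.
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the strand that would base-pair with `seq`, read 5' to 3'."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def count_different_probes(probes):
    """Count probes, treating a probe and its complement as one probe."""
    seen = set()
    for seq in probes:
        seq = seq.upper()
        # Use the alphabetically smaller strand as the shared key, so that
        # a sequence and its reverse complement collapse to one entry.
        seen.add(min(seq, reverse_complement(seq)))
    return len(seen)


# Hypothetical toy sequences; the second is the complement of the first.
example = ["ATGCCG", "CGGCAT", "TTTACG"]
n_different = count_different_probes(example)
```

A set would satisfy the "at least 10 oligonucleotides" requirement under this reading only if `count_different_probes` returned 10 or more. Multiple copies of the same sequence also collapse to one entry.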

Preferably said oligonucleotides are as described 
in Table 1 or are derived from a sequence described in 
Table 1. Especially preferably said oligonucleotides 
are as described in Table 2 or Table 4 or are derived 
from a sequence described in either of those tables. 
Especially preferably the oligonucleotide (or the 
oligonucleotide derived therefrom) has a high occurrence 
as defined in Table 3, especially preferably >40%, e.g. 
>80 or >90, e.g. 100%. 

A "set" as described refers to a collection of 
unique oligonucleotide probes (ie. having a distinct 
sequence) and preferably consists of less than 1000 
oligonucleotide probes, especially less than 500 probes, 
e.g. preferably from 10 to 500, e.g. 10 to 100, 200 or 
300, especially preferably 20 to 100, e.g. 30 to 100 
probes. In some cases less than 10 probes may be used, 
e.g. from 2 to 9 probes, e.g. 5 to 9 probes. 

It will be appreciated that increasing the number 
of probes will reduce the possibility of poor analysis, 
e.g. misdiagnosis by comparison to other diseases which 
could similarly alter the expression of the particular 
genes in question. Other oligonucleotide probes not 
described herein may also be present, particularly if 
they aid the ultimate use of the set of oligonucleotide 
probes. However, preferably said set consists only of 
said Table 1 oligonucleotides, Table 1 derived 
oligonucleotides, complementary sequences or 
functionally equivalent oligonucleotides, or a sub-set 
thereof (e.g. of the size as described above), 
preferably a sub-set for which sequences are provided 
herein (see Table 1 and its footnote). Especially 
preferably said set consists only of said Table 1 
oligonucleotides, Table 1 derived oligonucleotides, or 
complementary sequences thereof, or a sub-set thereof. 

Multiple copies of each unique oligonucleotide 
probe, e.g. 10 or more copies, may be present in each 
set, but constitute only a single probe. 

A set of oligonucleotide probes, which may 
preferably be immobilized on a solid support or have 
means for such immobilization, comprises the at least 10 
oligonucleotide probes selected from those described 
hereinbefore. Especially preferably said probes are 
selected from those having high occurrence as described 
in Table 3 and as mentioned above. As mentioned above, 
these 10 probes must be unique and have different 
sequences. Having said this however, two separate 
probes may be used which recognize the same gene but 
reflect different splicing events. However 
oligonucleotide probes which are complementary to, and 
bind to distinct genes are preferred. 

As described herein a "functionally equivalent" 
oligonucleotide to those described in Table 1 or derived 
therefrom refers to an oligonucleotide which is capable 
of identifying the same gene as an oligonucleotide of 
Table 1 or derived therefrom, ie. it can bind to the 
same mRNA molecule (or DNA) transcribed from a gene 
(target nucleic acid molecule) as the Table 1 
oligonucleotide or the Table 1 derived oligonucleotide 
(or its complementary sequence). Preferably said 
functionally equivalent oligonucleotide is capable of 
recognizing, ie. binding to the same splicing product as 
a Table 1 oligonucleotide or a Table 1 derived 
oligonucleotide. Preferably said mRNA molecule is the 
full length mRNA molecule which corresponds to the Table 
1 oligonucleotide or the Table 1 derived 
oligonucleotide. 

As referred to herein "capable of binding" or 
"binding" refers to the ability to hybridize under 
conditions described hereinafter. 

Alternatively expressed, functionally equivalent 
oligonucleotides (or complementary sequences) have 
sequence identity or will hybridize, as described 
hereinafter, to a region of the target molecule to which 
molecule a Table 1 oligonucleotide or a Table 1 derived 
oligonucleotide or a complementary oligonucleotide 
binds. Preferably, functionally equivalent 
oligonucleotides (or their complementary sequences) 
hybridize to one of the mRNA sequences which corresponds 
to a Table 1 oligonucleotide or a Table 1 derived 
oligonucleotide under the conditions described 
hereinafter or has sequence identity to a part of one of 
the mRNA sequences which corresponds to a Table 1 
oligonucleotide or a Table 1 derived oligonucleotide. A 
"part" in this context refers to a stretch of at least 
5, e.g. at least 10 or 20 bases, such as from 5 to 100, 
e.g. 10 to 50 or 15 to 30 bases. 

In a particularly preferred aspect, the 
functionally equivalent oligonucleotide binds to all or 
a part of the region of a target nucleic acid molecule 
(mRNA or cDNA) to which the Table 1 oligonucleotide or 
Table 1 derived oligonucleotide binds. A "target" 
nucleic acid molecule is the gene transcript or related 
product e.g. mRNA, or cDNA, or amplified product 
thereof. Said "region" of said target molecule to which 
said Table 1 oligonucleotide or Table 1 derived 
oligonucleotide binds is the stretch over which 
complementarity exists. At its largest this region is 
the whole length of the Table 1 oligonucleotide or Table 
1 derived oligonucleotide, but may be shorter if the 
entire Table 1 sequence or Table 1 derived 
oligonucleotide is not complementary to a region of the 
target sequence. 

Preferably said part of said region of said target 
molecule is a stretch of at least 5, e.g. at least 10 or 
20 bases, such as from 5 to 100, e.g. 10 to 50 or 15 to 
30 bases. This may for example be achieved by said 
functionally equivalent oligonucleotide having several 
identical bases to the bases of the Table 1 
oligonucleotide or the Table 1 derived oligonucleotide. 

These bases may be identical over consecutive 
stretches, e.g. in a part of the functionally equivalent 
oligonucleotide, or may be present non-consecutively, 
but provide sufficient complementarity to allow binding 
to the target sequence. 

Thus in a preferred feature, said functionally 
equivalent oligonucleotide hybridizes under conditions 
of high stringency to a Table 1 oligonucleotide or a 
Table 1 derived oligonucleotide or the complementary 
sequence thereof. Alternatively expressed, said 
functionally equivalent oligonucleotide exhibits high 
sequence identity to all or part of a Table 1 
oligonucleotide. Preferably said functionally 
equivalent oligonucleotide has at least 70% sequence 
identity, preferably at least 80%, e.g. at least 90, 95, 
98 or 99%, to all of a Table 1 oligonucleotide or a part 
thereof. As used in this context, a "part" refers to a 
stretch of at least 5, e.g. at least 10 or 20 bases, 
such as from 5 to 100, e.g. 10 to 50 or 15 to 30 bases, 
in said Table 1 oligonucleotide. Especially preferably 
when sequence identity to only a part of said Table 1 
oligonucleotide is present, the sequence identity is 
high, e.g. at least 80% as described above. 

Functionally equivalent oligonucleotides which 
satisfy the above stated functional requirements include 
those which are derived from the Table 1 
oligonucleotides and also those which have been modified 
by single or multiple nucleotide base (or equivalent) 
substitution, addition and/or deletion, but which 
nonetheless retain functional activity, e.g. bind to the 
same target molecule as the Table 1 oligonucleotide or 
the Table 1 derived oligonucleotide from which they are 
further derived or modified. Preferably said 
modification is of from 1 to 50, e.g. from 10 to 30, 
preferably from 1 to 5 bases. Especially preferably 
only minor modifications are present, e.g. variations in 
less than 10 bases, e.g. less than 5 base changes. 

Within the meaning of "addition" equivalents are 
included oligonucleotides containing additional 
sequences which are complementary to the consecutive 
stretch of bases on the target molecule to which the 
Table 1 oligonucleotide or the Table 1 derived 
oligonucleotide binds. Alternatively the addition may 
comprise a different, unrelated sequence, which may for 
example confer a further property, e.g. to provide a 
means for immobilization such as a linker to bind the 
oligonucleotide probe to a solid support. 

Particularly preferred are naturally occurring 
equivalents such as biological variants, e.g. allelic, 
geographical or allotypic variants, e.g. 
oligonucleotides which correspond to a genetic variant, 
for example as present in a different species. 

Functional equivalents include oligonucleotides 
with modified bases, e.g. using non-naturally occurring 
bases. Such derivatives may be prepared during 
synthesis or by post production modification. 

"Hybridizing" sequences which bind under conditions 
of low stringency are those which bind under non- 
stringent conditions (for example, 6x SSC/50% formamide 
at room temperature) and remain bound when washed under 
conditions of low stringency (2 X SSC, room temperature, 
more preferably 2 X SSC, 42°C). Hybridizing under high 
stringency refers to the above conditions in which 
washing is performed at 2 X SSC, 65°C (where SSC = 0.15M 
NaCl, 0.015M sodium citrate, pH 7.2). 
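
For reference, the salt concentrations implied by the stated SSC strengths follow by simple scaling of the 1x definition given above. The sketch below assumes only that definition (0.15 M NaCl, 0.015 M sodium citrate); the function name is hypothetical.

```python
# 1x SSC as defined in the text: 0.15 M NaCl, 0.015 M sodium citrate.
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each salt in `fold` x SSC."""
    # Rounding removes floating-point noise from the multiplication.
    return {salt: round(molar * fold, 4) for salt, molar in SSC_1X.items()}


hybridization = ssc_molarity(6)  # 6x SSC, used for non-stringent binding
wash = ssc_molarity(2)           # 2x SSC, used for the washes
print(hybridization, wash)
```

On this basis 6x SSC corresponds to 0.9 M NaCl with 0.09 M sodium citrate, and 2x SSC to 0.3 M NaCl with 0.03 M sodium citrate. The low and high stringency washes described above use the same 2x SSC and differ only in temperature.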

"Sequence identity" as referred to herein refers to 
the value obtained when assessed using ClustalW 
(Thompson et al., 1994, Nucl. Acids Res., 22, p4673-4680) 
with the following parameters: 
Pairwise alignment parameters - Method: accurate, 
Matrix: IUB, Gap open penalty: 15.00, Gap extension 
penalty: 6.66; 

Multiple alignment parameters - Matrix: IUB, Gap open 
penalty: 15.00, % identity for delay: 30, Negative 
matrix: no, Gap extension penalty: 6.66, DNA transitions 
weighting: 0.5. 

Sequence identity at a particular base is intended 
to include identical bases which have simply been 
derivatized. 
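
Once two sequences have been aligned, a percentage identity can be read off the alignment. The following sketch is a simplified illustration only: it does not perform the ClustalW alignment itself, it takes two already-aligned strings in which "-" marks a gap, and it assumes that gap columns are excluded from the comparison. The function name and the example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned, non-gap columns that carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column: not counted (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned pair: 5 compared columns, 4 of them identical.
identity = percent_identity("ATGC-A", "ATGGTA")
meets_minimum = identity >= 70.0  # the "at least 70%" preference stated earlier
```

Derivatized bases would need to be mapped back to their parent base before comparison for this sketch to follow the rule stated above.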

The invention also extends to polypeptides encoded 
by the mRNA sequence to which a Table 1 oligonucleotide 
or a Table 1 derived oligonucleotide binds. The 
invention further extends to antibodies which bind to 
any of said polypeptides. 

As described above, conveniently said set of 
oligonucleotide probes may be immobilized on one or more 
solid supports. Single or preferably multiple copies of 
each unique probe are attached to said solid supports, 
e.g. 10 or more, e.g. at least 100 copies of each unique 
probe are present. 

One or more unique oligonucleotide probes may be 
associated with separate solid supports which together 
form a set of probes immobilized on multiple solid 
supports, e.g. one or more unique probes may be 
immobilized on multiple beads, membranes, filters, 
biochips etc. which together form a set of probes, which 
together form modules of the kit described hereinafter. 

The solid supports of the different modules are 
conveniently physically associated although the signals 
associated with each probe (generated as described 
hereinafter) must be separately determinable. 

Alternatively, the probes may be immobilized on 
discrete portions of the same solid support, e.g. each 
unique oligonucleotide probe, e.g. in multiple copies, 
may be immobilized to a distinct and discrete portion or 
region of a single filter or membrane, e.g. to generate 
an array. 

A combination of such techniques may also be used, 
e.g. several solid supports may be used which each 
immobilize several unique probes. 

The expression "solid support" shall mean any solid 
material able to bind oligonucleotides by hydrophobic, 
ionic or covalent bridges. 

"Immobilization" as used herein refers to 
reversible or irreversible association of the probes to 
said solid support by virtue of such binding. If 
reversible, the probes remain associated with the solid 
support for a time sufficient for methods of the 
invention to be carried out. 

Numerous solid supports suitable as immobilizing 
moieties according to the invention, are well known in 
the art and widely described in the literature and 
generally speaking, the solid support may be any of the 
well-known supports or matrices which are currently 
widely used or proposed for immobilization, separation 
etc. in chemical or biochemical procedures. Such 
materials include, but are not limited to, any synthetic 
organic polymer such as polystyrene, polyvinylchloride, 
polyethylene; or nitrocellulose and cellulose acetate; 
or tosyl activated surfaces; or glass or nylon or any 
surface carrying a group suited for covalent coupling of 
nucleic acids. The immobilizing moieties may take the 
form of particles, sheets, gels, filters, membranes, 
microfibre strips, tubes or plates, fibres or 
capillaries, made for example of a polymeric material 
e.g. agarose, cellulose, alginate, teflon, latex or 
polystyrene or magnetic beads. Solid supports allowing 
the presentation of an array, preferably in a single 
dimension are preferred, e.g. sheets, filters, 
membranes, plates or biochips. 

Attachment of the nucleic acid molecules to the 
solid support may be performed directly or indirectly. 
For example if a filter is used, attachment may be 
performed by UV-induced crosslinking. Alternatively, 
attachment may be performed indirectly by the use of an 
attachment moiety carried on the oligonucleotide probes 
and/or solid support. Thus for example, a pair of 
affinity binding partners may be used, such as avidin, 
streptavidin or biotin, DNA or DNA binding protein (e.g. 
either the lac I repressor protein or the lac operator 
sequence to which it binds), antibodies (which may be 
mono- or polyclonal), antibody fragments or the epitopes 
or haptens of antibodies. In these cases, one partner 
of the binding pair is attached to (or is inherently 
part of) the solid support and the other partner is 
attached to (or is inherently part of) the nucleic acid 
molecules. 

As used herein an "affinity binding pair" refers to 
two components which recognize and bind to one another 
specifically (ie. in preference to binding to other 
molecules). Such binding pairs when bound together form 
a complex. 

Attachment of appropriate functional groups to the 
solid support may be performed by methods well known in 
the art, which include for example, attachment through 
hydroxyl, carboxyl, aldehyde or amino groups which may 
be provided by treating the solid support to provide 
suitable surface coatings. Solid supports presenting 
appropriate moieties for attachment of the binding 
partner may be produced by routine methods known in the 
art. 

Attachment of appropriate functional groups to the 
oligonucleotide probes of the invention may be performed 
by ligation or introduced during synthesis or 
amplification, for example using primers carrying an 
appropriate moiety, such as biotin or a particular 
sequence for capture. 

Conveniently, the set of probes described 
hereinbefore is provided in kit form. 

Thus viewed from a further aspect the present 
invention provides a kit comprising a set of 
oligonucleotide probes as described hereinbefore 
immobilized on one or more solid supports. 

Preferably, said probes are immobilized on a single 
solid support and each unique probe is attached to a 
different region of said solid support. However, when 
attached to multiple solid supports, said multiple solid 
supports form the modules which make up the kit. 
Especially preferably said solid support is a sheet, 
filter, membrane, plate or biochip. 

Optionally the kit may also contain information 
relating to the signals generated by normal or diseased 
samples (as discussed in more detail hereinafter in 
relation to the use of the kits), standardizing 
materials, e.g. mRNA or cDNA from normal and/or diseased 
samples for comparative purposes, labels for 
incorporation into cDNA, adapters for introducing 
nucleic acid sequences for amplification purposes, 
primers for amplification and/or appropriate enzymes, 
buffers and solutions. Optionally said kit may also 
contain a package insert describing how the method of 
the invention should be performed, optionally providing 
standard graphs, data or software for interpretation of 
results obtained when performing the invention. 

The use of such kits to prepare a standard 
diagnostic gene transcript pattern as described 
hereinafter forms a further aspect of the invention. 

The set of probes as described herein have various 
uses. Principally however they are used to assess the 
gene expression state of a test cell to provide 
information relating to the organism from which said 
cell is derived. Thus the probes are useful in 
diagnosing, identifying or monitoring a disease or 
condition or stage thereof in an organism. 

Thus in a further aspect the invention provides the 
use of a set of oligonucleotide probes or a kit as 
described hereinbefore to determine the gene expression 
pattern of a cell which pattern reflects the level of 
gene expression of genes to which said oligonucleotide 
probes bind, comprising at least the steps of: 

a) isolating mRNA from said cell, which may 
optionally be reverse transcribed to cDNA; 

b) hybridizing the mRNA or cDNA of step (a) to a 
set of oligonucleotide probes or a kit as defined 
herein; and 

c) assessing the amount of mRNA or cDNA hybridizing 
to each of said probes to produce said pattern. 

The mRNA and cDNA as referred to in this method, 
and the methods hereinafter, encompass derivatives or 
copies of said molecules, e.g. copies of such molecules 
such as those produced by amplification or the 
preparation of complementary strands, but which retain 
the identity of the mRNA sequence, ie. would hybridize 
to the direct transcript (or its complementary sequence) 
by virtue of precise complementarity, or sequence 
identity, over at least a region of said molecule. It 
will be appreciated that complementarity will not exist 
over the entire region where techniques have been used 
which may truncate the transcript or introduce new 
sequences, e.g. by primer amplification. For 
convenience, said mRNA or cDNA is preferably amplified 
prior to step b). As with the oligonucleotides 
described herein said molecules may be modified, e.g. by 
using non-natural bases during synthesis providing 
complementarity remains. Such molecules may also carry 
additional moieties such as signalling or immobilizing 
means . 

The various steps involved in the method of 
preparing such a pattern are described in more detail 
hereinafter. 

As used herein "gene expression" refers to 
transcription of a particular gene to produce a specific 
mRNA product (i.e. a particular splicing product). The 
level of gene expression may be determined by assessing 
the level of transcribed mRNA molecules or cDNA 
molecules reverse transcribed from the mRNA molecules or 
products derived from those molecules, e.g. by 
amplification. 

The "pattern" created by this technique refers to 
information which, for example, may be represented in 
tabular or graphical form and conveys information about 
the signal associated with two or more oligonucleotides. 

Preferably said pattern is expressed as an array of 
numbers relating to the expression level associated with 
each probe. 
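
By way of illustration only, such an array of numbers might be assembled as sketched below. The probe identifiers, the signal values and the function name build_pattern are hypothetical and are not taken from the Tables herein; the sketch merely shows one fixed ordering of probes being applied to the measured signals.

```python
def build_pattern(probe_ids, signals):
    """Return the pattern as a list of expression levels, one per probe.

    probe_ids fixes the order of the array; signals maps each probe
    identifier to the signal measured for that probe (hypothetical data).
    """
    missing = [p for p in probe_ids if p not in signals]
    if missing:
        raise KeyError(f"no signal recorded for probes: {missing}")
    return [float(signals[p]) for p in probe_ids]


# Hypothetical probe identifiers and signal intensities.
probe_order = ["probe_1", "probe_2", "probe_3"]
measured = {"probe_2": 830.0, "probe_1": 120.5, "probe_3": 15.0}
pattern = build_pattern(probe_order, measured)
```

Keeping the probe order fixed means that patterns from different samples can be compared position by position.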

Preferably, said pattern is established using the 
following linear model: 

        y = Xb + f          (Equation 1) 

wherein X is the matrix of gene expression data, y 
is the response variable, b is the regression 
coefficient vector and f is the estimated residual vector. 
Although many different methods can be used to establish 
the relationship provided in Equation 1, especially 
preferably the Partial Least Squares Regression (PLSR) 
method is used for establishing the relationship in 
Equation 1. 
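
As a non-limiting sketch of how the relationship y = Xb + f might be established, the following fits a PLS1 model restricted to a single latent component on mean-centred data. The restriction to one component, the function names and the numerical data are assumptions made for illustration; in practice the number of components would normally be chosen by validation and an established statistical package would be used.

```python
import math


def fit_pls1_one_component(X, y):
    """Estimate b in y = Xb + f using one PLS component (centred data)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in probe space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("X carries no covariance with y")
    w = [v / norm for v in w]
    # Scores: projection of each sample onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the centred response on the scores.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    b = [wj * q for wj in w]
    return b, x_mean, y_mean


def predict(x, b, x_mean, y_mean):
    """Predicted response for one expression pattern x."""
    return y_mean + sum((xj - mj) * bj for xj, mj, bj in zip(x, x_mean, b))


# Hypothetical data: rows are samples, columns are probes;
# y codes the state of each sample (0 = normal, 1 = disease).
X = [[2.0, 9.0, 4.0], [3.0, 8.0, 5.0], [8.0, 2.0, 6.0], [9.0, 1.0, 5.0]]
y = [0.0, 0.0, 1.0, 1.0]
b, x_mean, y_mean = fit_pls1_one_component(X, y)
# Estimated residual vector f for the training samples.
f = [yi - predict(xi, b, x_mean, y_mean) for xi, yi in zip(X, y)]
```

A new pattern x is then scored with predict(x, b, x_mean, y_mean).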

The probes are thus used to generate a pattern 
which reflects the gene expression of a cell at the time 
of its isolation. The pattern of expression is 
characteristic of the circumstances under which that 
cell finds itself and depends on the influences to 
which the cell has been exposed. Thus, a characteristic 
gene transcript pattern standard or fingerprint 
(standard probe pattern) for cells from an individual 
with a particular disease or condition may be prepared 
and used for comparison to transcript patterns of test 
cells. This has clear applications in diagnosing, 
monitoring or identifying whether an organism is 
suffering from a particular disease, condition or stage 
thereof. 

The standard pattern is prepared by determining the 
extent of binding of total mRNA (or cDNA or related 
product), from cells from a sample of one or more 
organisms with the disease or condition or stage 
thereof, to the probes. This reflects the level of 
transcripts which are present which correspond to each 
unique probe. The amount of nucleic acid material which 
binds to the different probes is assessed and this 
information together forms the gene transcript pattern 
standard of that disease or condition or stage thereof. 

Each such standard pattern is characteristic of the 
disease, condition or stage thereof. 

In a further aspect therefore, the present 
invention provides a method of preparing a standard gene 
transcript pattern characteristic of a disease or 
condition or stage thereof in an organism comprising at 
least the steps of: 

a) isolating mRNA from the cells of a sample of one 
or more organisms having the disease or condition or 
stage thereof, which may optionally be reverse 
transcribed to cDNA; 

b) hybridizing the mRNA or cDNA of step (a) to a 
set of oligonucleotides or a kit as described 
hereinbefore specific for said disease or condition or 
stage thereof in an organism and sample thereof 
corresponding to the organism and sample thereof under 
investigation; and 

c) assessing the amount of mRNA or cDNA hybridizing 
to each of said probes to produce a characteristic 
pattern reflecting the level of gene expression of genes 
to which said oligonucleotides bind, in the sample with 
the disease, condition or stage thereof. 

For convenience, said oligonucleotides are 
preferably immobilized on one or more solid supports. 

The standard pattern for a great number of diseases 
or conditions and different stages thereof using 
particular probes may be accumulated in databases and be 
made available to laboratories on request. 

"Disease" samples and organisms as referred to 
herein refer to organisms (or samples from the same) 
with an underlying pathological disturbance relative to 
a normal organism (or sample), in a symptomatic or 
asymptomatic organism, which may result, for example, 
from infection or an acquired or congenital genetic 
imperfection. Such organisms are known to have, or 
exhibit, the disease or condition or stage thereof 
under study. 

A "condition" refers to a state of the mind or body 
of an organism which has not occurred through disease, 
e.g. the presence of an agent in the body such as a 
toxin, drug or pollutant, or pregnancy. 

"Stages" thereof refer to different stages of the 
disease or condition which may or may not exhibit 
particular physiological or metabolic changes, but do 
exhibit changes at the genetic level which may be 
detected as altered gene expression. It will be 
appreciated that during the course of a disease or 
condition the expression of different transcripts may 
vary. Thus at different stages, altered expression may 
not be exhibited for particular transcripts compared to 
"normal" samples. However, combining information from 
several transcripts which exhibit altered expression at 
one or more stages through the course of the disease or 
condition can be used to provide a characteristic 
pattern which is indicative of a particular stage of the 
disease or condition. Thus for example different stages 
in cancer, e.g. pre-stage I, stage I, stage II, III or IV 
can be identified. 

"Normal" as used herein refers to organisms or 
samples which are used for comparative purposes. 
Preferably, these are "normal" in the sense that they do 
not exhibit any indication of, or are not believed to 
have, any disease or condition that would affect gene 
expression, particularly in respect of the disease for 
which they are to be used as the normal standard. 
However, it will be appreciated that different stages of 
a disease or condition may be compared and in such 
cases, the "normal" sample may correspond to the earlier 
stage of the disease or condition. 

As used herein a "sample" refers to any material 
obtained from the organism, e.g. a human or non-human 
animal, under investigation which contains cells and 
includes tissues, body fluid or body waste or, in the 
case of prokaryotic organisms, the organism itself. 
"Body fluids" include blood, saliva, spinal fluid, 
semen and lymph. "Body waste" includes urine, expectorated 
matter (pulmonary patients), faeces etc. "Tissue 
samples" include tissue obtained by biopsy, by surgical 
interventions or by other means, e.g. placenta. 
Preferably however, the samples which are examined are 
from areas of the body not apparently affected by the 
disease or condition. The cells in such samples are not 
disease cells, e.g. cancer cells, have not been in 
contact with such disease cells and do not originate 
from the site of the disease or condition. The "site of 
disease" is considered to be that area of the body which 
manifests the disease in a way which may be objectively 
determined, e.g. a tumour or area of inflammation. Thus 
for example peripheral blood may be used for the 
diagnosis of non-haematopoietic cancers, and the blood 
does not require the presence of malignant or 
disseminated cells from the cancer in the blood. 
Similarly in diseases of the brain, in which no diseased 
cells are found in the blood due to the blood-brain 
barrier, peripheral blood may still be used in the 
methods of the invention. 

It will however be appreciated that the method of 
preparing the standard transcription pattern and other 
methods of the invention are also applicable for use on 
living parts of eukaryotic organisms such as cell lines 
and organ cultures and explants. As used herein, 
reference to "corresponding" sample etc. refers to cells 
preferably from the same tissue, body fluid or body 
waste, but also includes cells from tissue, body fluid 
or body waste which are sufficiently similar for the 
purposes of preparing the standard or test pattern. 
When used in reference to genes "corresponding" to the 
probes, this refers to genes which are related by 
sequence (which may be complementary) to the probes 
although the probes may reflect different splicing 
products of expression. 

"Assessing" as used herein refers to both 
quantitative and qualitative assessment which may be 
determined in absolute or relative terms. 

The invention may be put into practice as follows. 

To prepare a standard transcript pattern for a 
particular disease, condition or stage thereof, sample 
mRNA is extracted from the cells of tissues, body fluid 
or body waste according to known techniques (see for 
example Sambrook et al. (1989), Molecular Cloning: A 
Laboratory Manual, 2nd Ed., Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, N.Y.) from a 
diseased individual or organism. 

Owing to the difficulties in working with RNA, the 
RNA is preferably reverse transcribed at this stage to 
form first strand cDNA. Cloning of the cDNA or 
selection from, or using, a cDNA library is not however 
necessary in this or other methods of the invention. 
Preferably, the complementary strands of the first 
strand cDNAs are synthesized, i.e. second strand cDNAs, 
but this will depend on which relative strands are 
present in the oligonucleotide probes. The RNA may 
however alternatively be used directly without reverse 
transcription and may be labelled if so required. 

Preferably the cDNA strands are amplified by known 
amplification techniques such as the polymerase chain 
reaction (PCR) by the use of appropriate primers. 
Alternatively, the cDNA strands may be cloned with a 
vector, used to transform a bacterium such as E. coli 
which may then be grown to multiply the nucleic acid 
molecules. When the sequences of the cDNAs are not 
known, primers may be directed to regions of the nucleic 
acid molecules which have been introduced. Thus for 
example, adapters may be ligated to the cDNA molecules 
and primers directed to these portions for amplification 
of the cDNA molecules. Alternatively, in the case of 
eukaryotic samples, advantage may be taken of the polyA 
tail and cap of the RNA to prepare appropriate primers. 

To produce the standard diagnostic gene transcript 
pattern or fingerprint for a particular disease or 
condition or stage thereof, the above described 
oligonucleotide probes are used to probe mRNA or cDNA of 
the diseased sample to produce a signal for 
hybridization to each particular oligonucleotide probe 
species, i.e. each unique probe. A standard control gene 
transcript pattern may also be prepared if desired using 
mRNA or cDNA from a normal sample. Thus, mRNA or cDNA 
is brought into contact with the oligonucleotide probe 
under appropriate conditions to allow hybridization. 

When multiple samples are probed, this may be 
performed consecutively using the same probes, e.g. on 
one or more solid supports, i.e. on probe kit modules, or 
by simultaneously hybridizing to corresponding probes, 
e.g. the modules of a corresponding probe kit. 

To identify when hybridization occurs and obtain an 
indication of the number of transcripts/cDNA molecules 
which become bound to the oligonucleotide probes, it is 
necessary to identify a signal produced when the 
transcripts (or related molecules) hybridize (e.g. by 
detection of double stranded nucleic acid molecules or 
detection of the number of molecules which become bound, 
after removing unbound molecules, e.g. by washing). 

In order to achieve a signal, either or both 
components which hybridize (i.e. the probe and the 
transcript) carry or form a signalling means or a part 
thereof. This "signalling means" is any moiety capable 
of direct or indirect detection by the generation or 
presence of a signal. The signal may be any detectable 
physical characteristic such as conferred by radiation 
emission, scattering or absorption properties, magnetic 
properties, or other physical properties such as charge, 
size or binding properties of existing molecules (e.g. 
labels) or molecules which may be generated (e.g. gas 
emission etc.). Techniques are preferred which allow 
signal amplification, e.g. which produce multiple signal 
events from a single active binding site, e.g. by the 
catalytic action of enzymes to produce multiple 
detectable products. 

Conveniently the signalling means may be a label 
which itself provides a detectable signal. Conveniently 
this may be achieved by the use of a radioactive or 
other label which may be incorporated during cDNA 
production, the preparation of complementary cDNA 
strands, during amplification of the target mRNA/cDNA or 
added directly to target nucleic acid molecules. 

Appropriate labels are those which directly or 
indirectly allow detection or measurement of the 
presence of the transcripts/cDNA. Such labels include 
for example radiolabels, chemical labels, for example 
chromophores or fluorophores (e.g. dyes such as 
fluorescein and rhodamine), or reagents of high electron 
density such as ferritin, haemocyanin or colloidal gold. 

Alternatively, the label may be an enzyme, for example 
peroxidase or alkaline phosphatase, wherein the presence 
of the enzyme is visualized by its interaction with a 
suitable entity, for example a substrate. The label may 
also form part of a signalling pair wherein the other 
member of the pair is found on, or in close proximity 
to, the oligonucleotide probe to which the 
transcript/cDNA binds; for example, a fluorescent 
compound and a quench fluorescent substrate may be used. 

A label may also be provided on a different entity, 
such as an antibody, which recognizes a peptide moiety 
attached to the transcripts/cDNA, for example attached 
to a base used during synthesis or amplification. 

A signal may be achieved by the introduction of a 
label before, during or after the hybridization step. 
Alternatively, the presence of hybridizing transcripts 
may be identified by other physical properties, such as 
their absorbance, and in which case the signalling means 
is the complex itself. 

The amount of signal associated with each 
oligonucleotide probe is then assessed. The assessment 
may be quantitative or qualitative and may be based on 
binding of a single transcript species (or related cDNA 
or other products) to each probe, or binding of multiple 
transcript species to multiple copies of each unique 
probe. It will be appreciated that quantitative results 
will provide further information for the transcript 
fingerprint of the disease which is compiled. This data 
may be expressed as absolute values (in the case of 
macroarrays) or may be determined relative to a 
particular standard or reference e.g. a normal control 
sample. 
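
Purely as an illustration of expressing data relative to a reference, the sketch below converts a sample pattern into log2 ratios against a normal control pattern. The use of a log2 ratio, the pseudocount and the numerical values are assumptions chosen for the example and are not prescribed herein.

```python
import math


def relative_pattern(sample, reference, pseudocount=1.0):
    """Per-probe log2 ratio of sample signal to reference signal.

    The pseudocount (an assumption) avoids division by zero and damps
    ratios between very low signals.
    """
    if len(sample) != len(reference):
        raise ValueError("patterns must cover the same probes")
    return [
        math.log2((s + pseudocount) / (r + pseudocount))
        for s, r in zip(sample, reference)
    ]


# Hypothetical absolute signals for three probes.
disease_sample = [300.0, 45.0, 80.0]
normal_control = [150.0, 90.0, 80.0]
ratios = relative_pattern(disease_sample, normal_control)
```

Positive values indicate a higher signal in the sample than in the reference, negative values a lower signal, and zero no difference.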

Furthermore it will be appreciated that the 
standard diagnostic gene transcript pattern may be 
prepared using one or more disease samples (and normal 
samples if used) to perform the hybridization step to 
obtain patterns not biased towards a particular 
individual's variations in gene expression. 
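
For illustration, pooling several samples into one standard pattern might be sketched as follows, taking the per-probe mean as the standard value and the per-probe standard deviation as a measure of the range observed. The function name and the data are hypothetical, and the choice of mean and sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def standard_pattern(patterns):
    """Combine per-sample patterns into per-probe means and deviations.

    patterns is a list of equal-length lists, one list per sample.
    """
    per_probe = list(zip(*patterns))  # regroup the values probe by probe
    means = [mean(values) for values in per_probe]
    # A standard deviation needs at least two samples.
    sds = [stdev(values) if len(values) > 1 else 0.0 for values in per_probe]
    return means, sds


# Hypothetical patterns from three disease samples over three probes.
disease_patterns = [
    [310.0, 40.0, 82.0],
    [290.0, 50.0, 78.0],
    [300.0, 45.0, 80.0],
]
means, sds = standard_pattern(disease_patterns)
```

A single sample yields its own values as the means, with the deviations reported as zero.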

The use of the probes to prepare standard patterns 
and the standard diagnostic gene transcript patterns 
thus produced for the purpose of identification or 
diagnosis or monitoring of a particular disease or 
condition or stage thereof in a particular organism 
forms a further aspect of the invention. 

Once a standard diagnostic fingerprint or pattern 
has been determined for a particular disease or 
condition using the selected oligonucleotide probes, 
this information can be used to identify the presence, 
absence or extent or stage of that disease or condition 
in a different test organism or individual. 

To examine the gene expression pattern of a test 
sample, a test sample of tissue, body fluid or body 
waste containing cells, corresponding to the sample used 
for the preparation of the standard pattern, is obtained 
from a patient or the organism to be studied. A test 
gene transcript pattern is then prepared as described 
hereinbefore as for the standard pattern. 

In a further aspect therefore, the present 
invention provides a method of preparing a test gene 
transcript pattern comprising at least the steps of: 

a) isolating mRNA from the cells of a sample of 
said test organism, which may optionally be reverse 
transcribed to cDNA; 

b) hybridizing the mRNA or cDNA of step (a) to a 
set of oligonucleotides or a kit as described 
hereinbefore specific for a disease or condition or 
stage thereof in an organism and sample thereof 
corresponding to the organism and sample thereof under 
investigation; and 

c) assessing the amount of mRNA or cDNA hybridizing 
to each of said probes to produce said pattern 
reflecting the level of gene expression of genes to 
which said oligonucleotides bind, in said test sample. 

This test pattern may then be compared to one or 
more standard patterns to assess whether the sample 
contains cells having the disease, condition or stage 
thereof. 

Thus viewed from a further aspect the present 
invention provides a method of diagnosing or identifying 
or monitoring a disease or condition or stage thereof in 
an organism, comprising the steps of: 

a) isolating mRNA from the cells of a sample of 
said organism, which may optionally be reverse 
transcribed to cDNA; 

b) hybridizing the mRNA or cDNA of step (a) to a 
set of oligonucleotides or a kit as described 
hereinbefore specific for said disease or 
condition or stage thereof in an organism and 
sample thereof corresponding to the organism 
and sample thereof under investigation; 

c) assessing the amount of mRNA or cDNA 
hybridizing to each of said probes to produce 
a characteristic pattern reflecting the level 
of gene expression of genes to which said 
oligonucleotides bind, in said sample; and 

d) comparing said pattern to a standard 
diagnostic pattern prepared according to the 
method of the invention using a sample from an 
organism corresponding to the organism and 
sample under investigation to determine the 
presence of said disease or condition or a 
stage thereof in the organism under 
investigation. 

The method up to and including step c) is the 
preparation of a test pattern as described above. 

As referred to herein, "diagnosis" refers to 
determination of the presence or existence of a disease 
or condition or stage thereof in an organism. 
"Monitoring" refers to establishing the extent of a 
disease or condition, particularly when an individual is 
known to be suffering from a disease or condition, for 
example to monitor the effects of treatment or the 
development of a disease or condition, e.g. to determine 
the suitability of a treatment or provide a prognosis. 

The presence of the disease or condition or stage 
thereof may be determined by determining the degree of 
correlation between the standard and test samples' 
patterns. This necessarily takes into account the range 
of values which are obtained for normal and diseased 
samples. Although this can be established by obtaining 
standard deviations for several representative samples 
binding to the probes to develop the standard, it will 
be appreciated that single samples may be sufficient to 
generate the standard pattern to identify a disease if 
the test sample exhibits close enough correlation to 
that standard. Conveniently, the presence, absence, or 
extent of a disease or condition or stage thereof in a 
test sample can be predicted by inserting the data 
relating to the expression level of informative probes 
in the test sample into the standard diagnostic probe 
pattern established according to equation 1. 
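
One simple way to quantify the degree of correlation between a test pattern and a standard pattern is Pearson's correlation coefficient, sketched below. The choice of Pearson's coefficient, the threshold of 0.9 and the data are assumptions for illustration only; what counts as "close enough" would need to be established from the range of values seen in normal and diseased samples.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length patterns."""
    if len(a) != len(b):
        raise ValueError("patterns must cover the same probes")
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("a constant pattern has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


def matches_standard(test, standard, threshold=0.9):
    """True if the test pattern correlates with the standard at or above
    the (hypothetical) threshold."""
    return pearson(test, standard) >= threshold


# Hypothetical standard and test patterns over four probes.
standard = [300.0, 45.0, 80.0, 150.0]
test = [280.0, 50.0, 90.0, 140.0]
is_match = matches_standard(test, standard)
```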

Data generated using the above mentioned methods 
may be analysed using various techniques from the most 
basic visual representation (e.g. relating to intensity) 
to more complex data manipulation to identify underlying 
patterns which reflect the interrelationship of the 
level of expression of each gene to which the various 
probes bind, which may be quantified and expressed 
mathematically. Conveniently, the raw data thus 
generated may be manipulated by the data processing and 
statistical methods described hereinafter, particularly 
normalizing and standardizing the data and fitting the 
data to a classification model to determine whether said 
test data reflects the pattern of a particular disease, 
condition or stage thereof. 
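
The general flow of standardizing the data and fitting a classification model can be sketched as below. A nearest-centroid classifier is used here only as a simple stand-in; it is not the classification model of the invention, and the function names, labels and data are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(patterns):
    """Per-probe mean and standard deviation from the training patterns."""
    columns = list(zip(*patterns))
    # A probe with zero spread is given unit scale to avoid dividing by zero.
    return [mean(c) for c in columns], [pstdev(c) or 1.0 for c in columns]


def standardize(pattern, means, sds):
    """Centre and scale each probe value (z-score)."""
    return [(v - m) / s for v, m, s in zip(pattern, means, sds)]


def nearest_centroid(train, labels, test):
    """Assign the test pattern to the class with the closest mean pattern."""
    means, sds = fit_scaler(train)
    scaled = [standardize(p, means, sds) for p in train]
    centroids = {}
    for label in set(labels):
        members = [z for z, lab in zip(scaled, labels) if lab == label]
        centroids[label] = [mean(c) for c in zip(*members)]
    z_test = standardize(test, means, sds)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(z_test, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical training patterns (two probes) with known states.
train = [[1.0, 10.0], [2.0, 11.0], [8.0, 3.0], [9.0, 2.0]]
labels = ["normal", "normal", "disease", "disease"]
predicted = nearest_centroid(train, labels, [8.5, 2.5])
```

The scaling parameters are derived from the training patterns alone and then applied unchanged to the test pattern, so that the test data do not influence the model.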

The methods described herein may be used to 
identify, monitor or diagnose a disease, condition or 
ailment or its stage or progression, for which the 
oligonucleotide probes are informative. "Informative" 
probes as described herein, are those which reflect 
genes which have altered expression in the diseases or 
conditions in question, or particular stages thereof. 
Probes of the invention may not be sufficiently 
informative for diagnostic purposes when used alone, but 
are informative when used as one of several probes to 
provide a characteristic pattern, e.g. in a set as 
described hereinbefore. 

Preferably said probes correspond to genes which 
are systemically affected by said disease, condition or 
stage thereof. Especially preferably said genes, from 
which transcripts are derived which bind to probes of 
the invention, are metabolic or house-keeping genes and 
preferably are moderately or highly expressed. The 
advantage of using probes directed to moderately or 
highly expressed genes is that smaller clinical samples 
are required for generating the necessary gene 
expression data set, e.g. blood samples of less than 1 ml. 

Furthermore, it has been found that such genes 
which are already being actively transcribed tend to be 
more prone to being influenced, in a positive or 
negative way, by new stimuli. In addition, since 
transcripts are already being produced at levels which 
are generally detectable, small changes in those levels 
are readily detectable as for example, a certain 
detectable threshold does not need to be reached. 

In preferred methods of the invention, the set of 
probes of the invention are informative for a variety of 
different diseases, conditions or stages thereof. A 
sub-set of the probes disclosed herein may be used for 
diagnosis, identification or monitoring a particular 
disease, condition or stage thereof. Thus the probes 
may be used to diagnose or identify or monitor any 
condition, ailment, disease or reaction that leads to 
the relative increase or decrease in the activity of 
informative genes of any or all eukaryotic or 
prokaryotic organisms regardless of whether these 
changes have been caused by the influence of bacteria, 
virus, prions, parasites, fungi, radiation, natural or 
artificial toxins, drugs or allergens, including mental 
conditions due to stress, neurosis, psychosis or 
deteriorations due to the ageing of the organism, and 
conditions or diseases of unknown cause, providing a 
sub-set of the probes as described herein are 
informative for said disease or condition or stage 
thereof. 

Such diseases include those which result in 
metabolic or physiological changes, such as fever- 
associated diseases such as influenza or malaria. Other 
diseases which may be detected include for example 
yellow fever, sexually transmitted diseases such as 
gonorrhea, fibromyalgia, candida-related complex, cancer 
(for example of the stomach, lung, breast, prostate 
gland, bowel, skin, colon, ovary etc), Alzheimer's 
disease, disease caused by retroviruses such as HIV, 
senile dementia, multiple sclerosis and 
Creutzfeldt-Jakob disease, to mention a few. 

The invention may also be used to identify patients 
with psychiatric or psychosomatic diseases such as 
schizophrenia and eating disorders. Of particular 
importance is the use of this method to detect diseases, 
conditions, or stages thereof, which are not readily 
detectable by known diagnostic methods, such as HIV 
which is generally not detectable using known techniques 
1 to 4 months following infection. Conditions which may 
be identified include for example drug abuse, such as 
the use of narcotics, alcohol, steroids or 
performance-enhancing drugs. 

Preferably said disease to be identified or 
monitored is a cancer or a degenerative brain disorder 
(such as Alzheimer's or Parkinson's disease). 

In particular, a set of oligonucleotide probes, 
wherein said set comprises at least 10 oligonucleotides 
selected from: 

an oligonucleotide as described in Table 4 or an 
oligonucleotide derived therefrom or an 
oligonucleotide with a complementary sequence, or a 
functionally equivalent oligonucleotide, 
may be used for diagnosis or identification or 
monitoring the progression of Alzheimer's disease. 
Similarly Table 2 probes and Table 2 derived probes and 
their functional equivalents may be used to diagnose, 
identify or monitor the progression of breast cancer. 
Especially preferably the probes used for breast cancer 
analysis are selected based on their occurrence as set 
forth in Table 3 and as described hereinbefore. 

The diagnostic method may be used alone as an 
alternative to other diagnostic techniques or in 
addition to such techniques. For example, methods of 
the invention may be used as an alternative or additive 
diagnostic measure to diagnosis using imaging techniques 
such as Magnetic Resonance Imaging (MRI), ultrasound 
imaging, nuclear imaging or X-ray imaging, for example 
in the identification and/or diagnosis of tumours. 

The methods of the invention may be performed on 
cells from prokaryotic or eukaryotic organisms which may 
be any eukaryotic organisms such as human beings, other 
mammals and animals, birds, insects, fish and plants, 
and any prokaryotic organism such as a bacterium. 

Preferred non-human animals on which the methods of 
the invention may be conducted include, but are not 
limited to mammals, particularly primates, domestic 
animals, livestock and laboratory animals. Thus 
preferred animals for diagnosis include mice, rats, 
guinea pigs, cats, dogs, pigs, cows, goats, sheep, 
horses. Particularly preferably the disease state or 
condition of humans is diagnosed, identified or 
monitored. 

As described above, the sample under study may be 
any convenient sample which may be obtained from an 
organism. Preferably however, as mentioned above, the 
sample is obtained from a site distant to the site of 
disease and the cells in such samples are not disease 
cells, have not been in contact with such cells and do 
not originate from the site of the disease or condition. 

In such cases, although preferably absent, the sample 
may contain cells which do not fulfil these criteria. 
However, since the probes of the invention are concerned 
with transcripts whose expression is altered in cells 
which do satisfy these criteria, the probes are 
specifically directed to detecting changes in transcript 
levels in those cells even if in the presence of other, 
background cells. 

It has been found that the cells from such samples 
show significant and informative variations in the gene 
expression of a large number of genes. Thus, the same 
probe (or several probes) may be found to be informative 
in determinations regarding two or more diseases, 
conditions or stages thereof by virtue of the particular 
level of transcripts binding to that probe or the 
interrelationship of the extent of binding to that probe 
relative to other probes. As a consequence, it is 
possible to use a relatively small number of probes for 
screening for multiple disorders or diseases. This has 
consequences with regard to the selection of probes, 
discussed in relation to random identification of probes 
hereinafter, but also for the use of a single set of 
probes for more than one diagnosis. Table 9 which 
represents preferred probes of the invention discloses 
probes which are informative for both Alzheimer's and 
breast cancer. 

Thus, the present invention also provides sets of 
probes for diagnosing, identifying or monitoring two or 
more diseases, conditions or stages thereof, wherein at 
least one of said probes is suitable for said 
diagnosing, identifying or monitoring at least two of 
said diseases, conditions or stages thereof, and kits 
and methods of using the same. Preferably at least 5 
probes, e.g. from 5 to 15 probes, are used in at least 
two diagnoses. 

Thus, in a further preferred aspect, the present 
invention provides a method of diagnosis or 
identification or monitoring as described hereinbefore 
for the diagnosis, identification or monitoring of two 
or more diseases, conditions or stages thereof in an 
organism, wherein said test pattern produced in step c) 
of the diagnostic method is compared in step d) to at 
least two standard diagnostic patterns prepared as 
described previously, wherein each standard diagnostic 
pattern is a pattern generated for a different disease 
or condition or stage thereof. 
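
The comparison of one test pattern with two or more standard diagnostic patterns in step d) might, for example, be organised as sketched below, where each standard is scored against the test pattern and the standards are ranked by that score. The use of a correlation score, the disease names and the values are illustrative assumptions.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-length patterns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if spread == 0.0:
        raise ValueError("a constant pattern has no defined correlation")
    return cov / spread


def rank_standards(test_pattern, standards):
    """Score every standard pattern against the test pattern.

    standards maps a disease, condition or stage name to its standard
    pattern. Returns (name, score) pairs, best match first.
    """
    scores = {
        name: correlation(test_pattern, pattern)
        for name, pattern in standards.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical standard patterns over the same four probes.
standards = {
    "disease A": [300.0, 45.0, 80.0, 150.0],
    "disease B": [60.0, 210.0, 90.0, 30.0],
}
ranking = rank_standards([280.0, 50.0, 90.0, 140.0], standards)
```

Because every standard is built on the same set of probes, a single hybridization of the test sample suffices for all of the comparisons.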

Whilst in a preferred aspect the methods of 
assessment concern the development of a gene transcript 
pattern from a test sample and comparison of the same to 
a standard pattern, the elevation or depression of 
expression of certain markers may also be examined by 
examining the products of expression and the level of 
those products. Thus a standard pattern in relation to 
the expressed product may be generated. 

In such methods the levels of expression of a set 
of polypeptides encoded by the gene to which an 
oligonucleotide of Table 1 or a Table 1 derived 
oligonucleotide, binds, are analysed. 

Various diagnostic methods may be used to assess 
the amount of polypeptides (or fragments thereof) which 
are present. The presence or concentration of 
polypeptides may be examined, for example by the use of 
a binding partner to said polypeptide (e.g. an 
antibody), which may be immobilized, to separate said 
polypeptide from the sample and the amount of 
polypeptide may then be determined. 

"Fragments" of the polypeptides refers to a 
domain or region of said polypeptide, e.g. an antigenic 
fragment, which is recognizable as being derived from 
said polypeptide to allow binding of a specific binding 
partner. Preferably such a fragment comprises a 
significant portion of said polypeptide and corresponds 
to a product of normal post-synthesis processing. Thus 
in a further aspect the present invention provides a 
method of preparing a standard gene transcript pattern 
characteristic of a disease or condition or stage 
thereof in an organism comprising at least the steps of: 

a) releasing target polypeptides from a sample of 
one or more organisms having the disease or condition or 
stage thereof; 

b) contacting said target polypeptides with one or 
more binding partners, wherein each binding partner is 
specific to a marker polypeptide (or a fragment thereof) 
encoded by the gene to which an oligonucleotide of Table 
1 (or derived from a sequence described in Table 1) 
binds, to allow binding of said binding partners to said 
target polypeptides, wherein said marker polypeptides 
are specific for said disease or condition thereof in an 
organism and sample thereof corresponding to the 
organism and sample thereof under investigation; and 

c) assessing the target polypeptide binding to said 
binding partners to produce a characteristic pattern 
reflecting the level of gene expression of genes which 
express said marker polypeptides, in the sample with the 
disease, condition or stage thereof. 

As used herein "target polypeptides" refer to those 
polypeptides present in a sample which are to be 
detected and "marker polypeptides" are polypeptides 
which are encoded by the genes to which Table 1 
oligonucleotides or Table 1 derived oligonucleotides 
bind. The target and marker polypeptides are identical 
or at least have areas of high similarity, e.g. epitopic 
regions to allow recognition and binding of the binding 
partner. 

"Release" of the target polypeptides refers to 
appropriate treatment of a sample to provide the 
polypeptides in a form accessible for binding of the 
binding partners, e.g. by lysis of cells where these are 
present. The samples used in this case need not 
necessarily comprise cells as the target polypeptides 
may be released from cells into the surrounding tissue 
or fluid, and this tissue or fluid may be analysed, e.g. 
urine or blood. Preferably, however, the samples 
described herein as preferred are used. "Binding 
partners" comprise the separate entities which together 
make an affinity binding pair as described above, 
wherein one partner of the binding pair is the target or 
marker polypeptide and the other partner binds 
specifically to that polypeptide, e.g. an antibody. 

Various arrangements may be envisaged for detecting 
the amount of binding pairs which form. In its simplest 
form, a sandwich type assay, e.g. an immunoassay such as 
an ELISA, may be used in which an antibody specific to 
the polypeptide and carrying a label (as described 
elsewhere herein) may be bound to the binding pair (e.g. 
the first antibody:polypeptide pair) and the amount of 
label detected. 

Other methods as described herein may be similarly 
modified for analysis of the protein product of 
expression rather than the gene transcript and related 
nucleic acid molecules. 

Thus a further aspect of the invention provides a 
method of preparing a test gene transcript pattern 
comprising at least the steps of: 

a) releasing target polypeptides from a sample of 
said test organism; 

b) contacting said target polypeptides with one or 
more binding partners, wherein each binding partner is 
specific to a marker polypeptide (or a fragment thereof) 
encoded by the gene to which an oligonucleotide of Table 
1 (or derived from a sequence described in Table 1) 
binds, to allow binding of said binding partners to said 
target polypeptides, wherein said marker polypeptides 
are specific for said disease or condition thereof in an 
organism and sample thereof corresponding to the 
organism and sample thereof under investigation; and 

c) assessing the target polypeptide binding to said 
binding partners to produce a characteristic pattern 
reflecting the level of gene expression of genes which 
express said marker polypeptides, in said test sample. 

A yet further aspect of the invention provides a 
method of diagnosing or identifying or monitoring a 
disease or condition or stage thereof in an organism 
comprising the steps of: 

a) releasing target polypeptides from a sample of 
said organism; 

b) contacting said target polypeptides with one or 
more binding partners, wherein each binding partner is 
specific to a marker polypeptide (or a fragment thereof) 
encoded by the gene to which an oligonucleotide of Table 
1 (or derived from a sequence described in Table 1) 
binds, to allow binding of said binding partners to said 
target polypeptides, wherein said marker polypeptides 
are specific for said disease or condition thereof in an 
organism and sample thereof corresponding to the 
organism and sample thereof under investigation; and 

c) assessing the target polypeptide binding to said 
binding partners to produce a characteristic pattern 
reflecting the level of gene expression of genes which 
express said marker polypeptides in said sample; and 

d) comparing said pattern to a standard diagnostic 
pattern prepared as described hereinbefore using a 
sample from an organism corresponding to the organism 
and sample under investigation to determine the degree 
of correlation indicative of the presence of said 
disease or condition or a stage thereof in the organism 
under investigation. 

The methods of generating standard and test 
patterns and diagnostic techniques rely on the use of 
informative oligonucleotide probes to generate the gene 
expression data. In some cases it will be necessary to 
select these informative probes for a particular method, 
e.g. to diagnose a particular disease, from a selection 
of available probes, e.g. the probes described 
hereinbefore (the Table 1 oligonucleotides, the Table 1 
derived oligonucleotides, their complementary sequences 
and functionally equivalent oligonucleotides). The 
following methodology describes a convenient method for 
identifying such informative probes, or more 
particularly how to select a suitable sub-set of probes 
from the probes described herein. 

Probes for the analysis of a particular disease or 
condition or stage thereof, may be identified in a 
number of ways known in the prior art, including by 
differential expression or by library subtraction (see 
for example WO98/49342). As described hereinafter, in 
view of the high information content of most 
transcripts, as a starting point one may also simply 
analyse a random sub-set of mRNA or cDNA species and 
pick the most informative probes from that sub-set. The 
following method describes the use of immobilized 
oligonucleotide probes (e.g. the probes of the 
invention) to which mRNA (or related molecules) from 
different samples is bound to identify which probes are 
the most informative to identify a particular type of 
sample, e.g. a disease sample. 

The immobilized probes can be derived from various 
unrelated or related organisms; the only requirement is 
that the immobilized probes should bind specifically to 
their homologous counterparts in test organisms. Probes 
can also be derived from commercially available or 
public databases and immobilized on solid supports or, 
as mentioned above, they can be randomly picked and 
isolated from a cDNA library and immobilized on a solid 
support. 

The length of the probes immobilised on the solid 
support should be long enough to allow for specific 
binding to the target sequences. The immobilised probes 
can be in the form of DNA, RNA or their modified 
products or PNAs (peptide nucleic acids). Preferably, 
the probes immobilised should bind specifically to their 
homologous counterparts representing highly and 
moderately expressed genes in test organisms. 
Conveniently the probes which are used are the probes 
described herein. 

The gene expression pattern of cells in biological 
samples can be generated using prior art techniques such 
as microarray or macroarray as described below or using 
methods described herein. Several technologies have now 
been developed for monitoring the expression level of a 
large number of genes simultaneously in biological 
samples, such as high-density oligoarrays (Lockhart et 
al., 1996, Nat. Biotech., 14, p1675-1680), cDNA 
microarrays (Schena et al., 1995, Science, 270, p467-470) 
and cDNA macroarrays (Maier E et al., 1994, Nucl. Acids 
Res., 22, p3423-3424; Bernard et al., 1996, Nucl. Acids 
Res., 24, p1435-1442). 

In high-density oligoarrays and cDNA microarrays, 
hundreds to thousands of probe oligonucleotides or 
cDNAs are spotted onto glass slides or nylon membranes, 
or synthesized on biochips. The mRNA isolated from the 
test and reference samples is labelled by reverse 
transcription with a red or green fluorescent dye, 
mixed, and hybridised to the microarray. After washing, 
the bound fluorescent dyes are detected by a laser, 
producing two images, one for each dye. The resulting 
ratio of the red and green spots on the two images 
provides the information about the changes in expression 
levels of genes in the test and reference samples. 
Alternatively, single channel or multiple channel 
microarray studies can also be performed. 

In cDNA macroarray, different cDNAs are spotted on 
a solid support such as nylon membranes in excess in 
relation to the amount of test mRNA that can hybridise 
to each spot. mRNA isolated from test samples is radio- 
labelled by reverse transcription and hybridised to the 
immobilised probe cDNA. After washing, the signals 
associated with labels hybridising specifically to 
immobilised probe cDNA are detected and quantified. The 
data obtained from a macroarray contain information about 
the relative levels of transcripts present in the test 
samples. Whilst macroarrays are only suitable for 
monitoring the expression of a limited number of genes, 
microarrays can be used to monitor the expression of 
several thousand genes simultaneously and are, therefore, 
a preferred choice for large-scale gene expression 
studies. 

A macroarray technique for generating the gene 
expression data set has been used to illustrate the 
probe identification method described herein. For this 
purpose, mRNA is isolated from samples of interest and 
used to prepare labelled target molecules, e.g. mRNA or 
cDNA as described above. The labelled target molecules 
are then hybridised to probes immobilised on the solid 
support. Various solid supports can be used for the 
purpose, as described previously. Following 
hybridization, unbound target molecules are removed and 
signals from target molecules hybridizing to immobilised 
probes quantified. If radio labelling is performed, a 
PhosphoImager can be used to generate an image file that 
can be used to generate a raw data set. Depending on 
the nature of the label chosen for labelling the target 
molecules, other instruments can also be used; for 
example, when fluorescence is used for labelling, a 
FluoroImager can be used to generate an image file from 
the hybridised target molecules. 

The raw data corresponding to mean intensity, 
median intensity, or volume of the signals in each spot 
can be acquired from the image file using commercially 
available software for image analysis. However, the 
acquired data needs to be corrected for background 
signals and normalized prior to analysis, since several 
factors can affect the quality and quantity of the 
hybridising signals. For example, variations in the 
quality and quantity of mRNA isolated from sample to 
sample, subtle variations in the efficiency of labelling 
target molecules during each reaction, and variations in 
the amount of non-specific binding between different 
macroarrays can all contribute to noise in the acquired 
data set that must be corrected for prior to analysis. 

Background correction can be performed in several 
ways. The lowest pixel intensity within a spot can be 
used for background subtraction or the mean or median of 
the line of pixels around the spot's outline can be used 
for the purpose. One can also define an area 
representing the background intensity based on the 
signals generated from negative controls and use the 
average intensity of this area for background 
subtraction. 

The background corrected data can then be 
transformed to stabilize the variance in the data 
structure and normalized for the differences in probe 
intensity. Several transformation techniques have been 
described in the literature and a brief overview can be 
found in Cui, Kerr and Churchill 
(http://www.jax.org/research/churchill/research/expression/Cui-Transform.pdf). 
Normalization can be 
performed by dividing the intensity of each spot by 
the collective intensity, average intensity or median 
intensity of all the spots in a macroarray or a group of 
spots in a macroarray in order to obtain the relative 
intensity of signals hybridising to immobilised probes 
in a macroarray. Several methods have been described 
for normalizing gene expression data (Richmond and 
Somerville, 2000, Current Opin. Plant Biol., 3, p108-116; 
Finkelstein et al., 2001, In "Methods of Microarray 
Data Analysis. Papers from CAMDA", Eds. Lin & Johnson, 
Kluwer Academic, p57-68; Yang et al., 2001, In "Optical 
Technologies and Informatics", Eds. Bittner, Chen, 
Dorsel & Dougherty, Proceedings of SPIE, 4266, p141-152; 
Dudoit et al., 2000, J. Am. Stat. Ass., 97, p77-87; Alter 
et al., 2000, supra; Newton et al., 2001, J. Comp. Biol., 
8, p37-52). Generally, a scaling factor or function is 
first calculated to correct the intensity effect and 
then used for normalising the intensities. The use of 
external controls has also been suggested for improved 
normalization. 
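
The background correction and normalization steps described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the example intensities, and the choice to clip negative corrected values to zero are hypothetical and are not taken from the present description.

```python
import numpy as np

def background_correct(spots, negative_controls):
    """Subtract the average negative-control intensity from every spot.

    Clipping at zero is an assumption made for this sketch only.
    """
    background = float(np.mean(negative_controls))
    return np.clip(np.asarray(spots, dtype=float) - background, 0.0, None)

def normalize(spots, method="median"):
    """Divide each spot by the collective, average or median intensity."""
    spots = np.asarray(spots, dtype=float)
    scale = {
        "sum": spots.sum(),          # collective intensity
        "mean": spots.mean(),        # average intensity
        "median": np.median(spots),  # median intensity
    }[method]
    if scale == 0:
        raise ValueError("cannot normalize an array with zero scale")
    return spots / scale

# Hypothetical raw spot intensities and negative-control spots.
raw = np.array([120.0, 340.0, 95.0, 560.0, 210.0])
negative = np.array([18.0, 22.0, 20.0])

corrected = background_correct(raw, negative)
relative = normalize(corrected, method="median")
```

After median normalization the median spot has a relative intensity of 1, so arrays with different overall signal strength are placed on a common scale.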

One other major challenge encountered in 
large-scale gene expression analysis is that of 
standardization of data collected from experiments 
performed at different times. We have observed that 
gene expression data for samples acquired in the same 
experiment can be efficiently compared following 
background correction and normalization. However, the 
data from samples acquired in experiments performed at 
different times requires further standardization prior 
to analysis. This is because subtle differences in 
experimental parameters between different experiments, 
for example, differences in the quality and quantity of 
mRNA extracted at different times, differences in time 
used for target molecule labelling, hybridization time 
or exposure time, can affect the measured values. Also, 
factors such as the nature of the sequence of 
transcripts under investigation (their GC content) and 
their amount in relation to each other determine 
how they are affected by subtle variations in the 
experimental processes. They determine, for example, 
how efficiently first strand cDNAs, corresponding to a 
particular transcript, are transcribed and labelled 
during first strand synthesis, or how efficiently the 
corresponding labelled target molecules bind to their 
complementary sequences during hybridization. Batch to 
batch difference in the printing process is also a major 
factor for variation in the generated expression data. 

Failure to properly address and correct for these 
influences leads to situations where the differences 
between the experimental series may overshadow the main 
information of interest contained in the gene expression 
data set, i.e. the differences within the combined data 
from the different experimental series. Figure 1 
provides one such example showing a classification based 
on Principal Component Analysis (PCA) of combined data 
from two experimental series where the main goal is to 
distinguish between Alzheimer/non-Alzheimer patients. 

PCA (closely related to singular value decomposition) is 
a technique for studying interdependencies and 
underlying relationships of a set of variables. The 
data are modelled in terms of a few significant factors 
or principal components (PCs), plus residuals. The 
PCs contain the main phenomena and define the 
systematic variability present in the data, while the 
residuals represent the variability interpreted as 
noise. Details on PCA can be found in Jolliffe (1986, 
Principal Component Analysis, Springer-Verlag, NY), and 
Jackson (1991, A User's Guide to Principal Components, 
Wiley, NY). The results of Figure 1 show that two 
clusters are formed representing the data from two 
experimental series rather than the 
Alzheimer/non-Alzheimer differentiation. There were 
eight samples in common between the two series of 
experiments, which ideally should have fallen on top of, 
or in near proximity to, each other if appropriately 
standardized. 
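
The decomposition of a data matrix into principal components plus residuals, and the way a difference between experimental series can dominate the first component, can be illustrated with the following sketch. The simulated data, the size of the series offset and the function name are assumptions made purely for illustration; they are not the data of Figure 1.

```python
import numpy as np

def pca(X, n_components):
    """PCA of a samples-by-genes matrix via singular value decomposition."""
    Xc = X - X.mean(axis=0)                      # centre each gene
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    residuals = Xc - scores @ loadings.T         # part treated as noise
    return Xc, scores, loadings, residuals

# Simulated example: 12 samples x 30 genes, where the last six samples
# come from a second experimental series with a systematic offset.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 30))
X[6:] += 2.0

Xc, scores, loadings, residuals = pca(X, n_components=2)
series_1, series_2 = scores[:6, 0], scores[6:, 0]
```

In this simulated case the first principal component separates the samples by experimental series, which is the kind of unwanted clustering that standardization is meant to remove.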

We have now found that gene expression data between 
different experiments can be efficiently standardized by 
including a subset of samples from one experimental 
series in the next experimental series and using a 
direct standardization method (DS), originally described 
by Wang and Kowalski (Anal. Chem., 1991, 63, p2750 and 
J. Chemometrics, 1991, 5, p129-145). Although the 
method of DS is well known in the field of analytical 
chemistry, it remains undescribed and unused in the 
field of gene expression data analysis. 

In DS, the secondary data representing, for example, 
experimental series 2 (secondary measurements, $R_2$) are 
corrected to match the primary measurements representing 
data from series 1 ($R_1$), while the calibration model 
remains unchanged. In DS, the response matrices for both 
experimental series are related to each other by a 
transformation matrix F, i.e. 

$R_1 = R_2 F$ (1) 

where F is a square matrix dimensioned gene by 
gene. From (1), the transformation matrix is calculated 
as: 

$F = R_2^{+} R_1$ (2) 

where $R_2^{+}$ denotes the pseudo-inverse of $R_2$. 

The transformation matrix F in equation (2) is 
calculated using a relatively small subset of samples 
which are measured in both the primary (master) and the 
secondary series of data. 

Finally, the response of the unknown sample 
measured in the secondary series, $r_{2,un}^{T}$, is 
standardized to the response vector $r_{1,un}^{T}$ 
expected from the primary series: 

$r_{1,un}^{T} = r_{2,un}^{T} F$ (3) 

From the preceding equation it can be seen that 
column i of the transformation matrix contains the 
multiplication factors for a set of genes measured in 
the secondary series to obtain the intensity at spot i 
of the corrected series. 
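
Equations (1) to (3) can be sketched numerically as follows. This is a minimal illustration only: the matrices are simulated, the function names are hypothetical, and the pseudo-inverse is computed with NumPy's `numpy.linalg.pinv`. Rows are samples and columns are genes.

```python
import numpy as np

def fit_direct_standardization(R1_shared, R2_shared):
    """Equation (2): F = pinv(R2) @ R1, from samples run in both series."""
    return np.linalg.pinv(R2_shared) @ R1_shared

def standardize(r2_unknown, F):
    """Equation (3): map a secondary-series response onto the primary series."""
    return r2_unknown @ F

# Simulated example: 3 shared samples measured on 5 genes in both series.
rng = np.random.default_rng(1)
R1_shared = rng.uniform(1.0, 10.0, size=(3, 5))
gene_gain = np.array([1.5, 0.8, 1.2, 2.0, 0.6])   # assumed series effect
R2_shared = R1_shared * gene_gain

F = fit_direct_standardization(R1_shared, R2_shared)

# A new sample lying in the span of the shared samples, seen in series 2.
mix = np.array([0.2, 0.5, 0.3])
r2_unknown = mix @ R2_shared
r1_expected = mix @ R1_shared
r1_estimated = standardize(r2_unknown, F)
```

In this sketch the correction is exact only for samples that lie in the span of the shared samples, which is one way to see why the repeated samples need to be representative of the variation in the data.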

The number of samples that are repeated in the 
experimental series, $R_1$ and $R_2$, should be equal to their 
ranks, which in this case is equal to the number of 
principal components retained for explaining the 
variation in $R_1$ and $R_2$. For example, if three 
principal components are retained for explaining the 
variation in the data set, a minimum of three samples 
should be repeated between $R_1$ and $R_2$. The samples that 
should be repeated between different series should 
ideally be those that exhibit high leverages in the gene 
expression pattern. At times, two samples may suffice, 
while at other times, more than two samples should 
ideally be included for good representativity. In some 
cases, the samples selected can be the same in all the 
experimental series to be compared (reference samples), 
while in other cases, representative samples can be 
selected sequentially by analyzing the expression 
pattern after each experiment. The selected samples 
with high leverages are then included in the next 
experimental series. The results of using Direct 
Standardization are shown in Figure 1. 

Another approach for normalizing and standardizing 
the gene expression data set is to hybridize each DNA 
array with target molecules prepared from a test sample 
and an equal amount of labelled target molecules 
prepared from representative reference samples. In 
order to measure the intensity of labelled target 
molecules hybridizing to the immobilized probes it is 
necessary that the labelled molecules are prepared from 
test and reference samples using different labels, for 
example, different fluorescent dyes can be used for 
preparing the labelled material. The labelled molecules 
prepared from reference samples can be added to the 
hybridization solution together with the labelled 
material prepared from test samples. A data file from 
each array representing the expression pattern of 
different genes in the test sample and reference samples 
can then be obtained, normalized and standardized by the 
direct standardization method as described above. An 
instant advantage of including the differentially 
labelled target molecules from reference samples during 
hybridization is that it enables an efficient comparison 
of new test samples to the data sets already stored in a 
database. 

Monitoring the expression of a large number of 
genes in several samples leads to the generation of a 
large amount of data that is too complex to be easily 
interpreted. Several unsupervised and supervised 
multivariate data analysis techniques have already been 
shown to be useful in extracting meaningful biological 
information from these large data sets. Cluster 
analysis is by far the most commonly used technique for 
gene expression analysis, and has been performed to 
identify genes that are regulated in a similar manner, 
and/or to identify new/unknown tumour classes using gene 
expression profiles (Eisen et al., 1998, PNAS, 95, 
p14863-14868; Alizadeh et al., 2000, supra; Perou et al., 
2000, Nature, 406, p747-752; Ross et al., 2000, Nature 
Genetics, 24(3), p227-235; Herwig et al., 1999, Genome 
Res., 9, p1093-1105; Tamayo et al., 1999, PNAS, 
96, p2907-2912). 

In the clustering method, genes are grouped into 
functional categories (clusters) based on their 
expression profile, satisfying two criteria: homogeneity 
- the genes in the same cluster are highly similar in 
expression to each other; and separation - genes in 
different clusters have low similarity in expression to 
each other. 

Examples of various clustering techniques that have 
been used for gene expression analysis include 
hierarchical clustering (Eisen et al., 1998, supra; 
Alizadeh et al., 2000, supra; Perou et al., 2000, supra; 
Ross et al., 2000, supra), K-means clustering (Herwig et 
al., 1999, supra; Tavazoie et al., 1999, Nature Genetics, 
22(3), p281-285), gene shaving (Hastie et al., 2000, 
Genome Biology, 1(2), research 0003.1-0003.21), block 
clustering (Tibshirani et al., 1999, Tech. report, Univ. 
Stanford), Plaid model (Lazzeroni, 2002, Stat. Sinica, 
12, p61-86), and self-organizing maps (Tamayo et al., 
1999, supra). Also, related methods of multivariate 
statistical analysis, such as those using the singular 
value decomposition (Alter et al., 2000, PNAS, 97(18), 
p10101-10106; Ross et al., 2000, supra) or 
multidimensional scaling can be effective at reducing 
the dimensions of the objects under study. 

However, methods such as cluster analysis and 
singular value decomposition are purely exploratory and 
only provide a broad overview of the internal structure 
present in the data. They are unsupervised approaches 
in which the available information concerning the nature 
of the class under investigation is not used in the 
analysis. Often, the nature of the biological 
perturbation to which a particular sample has been 
subjected is known. For example, it is sometimes known 
whether the sample whose gene expression pattern is 
being analysed derives from a diseased or healthy 
individual. In such instances, discriminant analysis 
can be used for classifying samples into various groups 
based on their gene expression data. 

In such an analysis one builds a classifier, by 
training on the data, that is capable of discriminating 
between members and non-members of a given class. The 
trained classifier can then be used to predict the class 
of unknown samples. Examples of discrimination methods 
that have been described in the literature include 
Support Vector Machines (Brown et al., 2000, PNAS, 97, 
p262-267), Nearest Neighbour (Dudoit et al., 2000, 
supra), Classification trees (Dudoit et al., 2000, 
supra), Voted classification (Dudoit et al., 2000, 
supra), Weighted Gene voting (Golub et al., 1999, supra), 
and Bayesian classification (Keller et al., 2000, Tech. 
report, Univ. of Washington). Also a technique in which 
PLS (Partial Least Squares) regression analysis is first 
used to reduce the dimensions in the gene expression 
data set followed by classification using logistic 
discriminant analysis and quadratic discriminant 
analysis (LD and QDA) has recently been described 
(Nguyen & Rocke, 2002, Bioinformatics, 18, p39-50 and 
1216-1226). 

A challenge that gene expression data poses to 
classical discriminatory methods is that the number of 
genes whose expression is being analysed is very large 
compared to the number of samples being analysed. 
However, in most cases only a small fraction of these 
genes are informative in discriminant analysis problems. 
Moreover, there is a danger that the noise from 
irrelevant genes can mask or distort the information 
from the informative genes. Several methods have been 
suggested in the literature to identify and select genes 
that are informative in microarray studies, for example, 
t-statistics (Dudoit et al., 2002, J. Am. Stat. Ass., 97, 
p77-87), analysis of variance (Kerr et al., 2000, PNAS, 
98, p8961-8965), Neighbourhood analysis (Golub et al., 
1999, supra), Ratio of between groups to within groups 
sum of squares (Dudoit et al., 2002, supra), Non 
parametric scoring (Park et al., 2002, Pacific Symposium 
on Biocomputing, p52-63) and Likelihood selection 
(Keller et al., 2000, supra). 

In the methods described herein the gene expression 
data that has been normalized and standardized is 
analysed by using Partial Least Squares Regression 
(PLSR). Although PLSR is primarily a method used for 
regression analysis of continuous data (see Appendix A), 
it can also be utilized as a method for model building 
and discriminant analysis using a dummy response matrix 
based on a binary coding. The class assignment is based 
on a simple dichotomous distinction such as breast 
cancer (class 1) / healthy (class 2), or a multiple 
distinction based on multiple disease diagnosis such as 
breast cancer (class 1) / Alzheimer (class 2) / healthy 
(class 3). The list of diseases for classification can 
be increased depending upon the samples available 
corresponding to other diseases or conditions or stages 
thereof. 

PLSR applied as a classification method is referred 
to as PLS-DA (DA standing for Discriminant Analysis). 
PLS-DA is an extension of the PLSR algorithm in which 
the Y-matrix is a dummy matrix containing n rows 
(corresponding to the number of samples) and K columns 
(corresponding to the number of classes). The Y-matrix 
is constructed by inserting 1 in the kth column and -1 
in all the other columns if the corresponding ith object 
of X belongs to class k. By regressing Y onto X, 
classification of a new sample is achieved by selecting 
the group corresponding to the largest component of the 
fitted response vector, 
$\hat{y}(x) = (\hat{y}_1(x), \hat{y}_2(x), \ldots, \hat{y}_K(x))$. 
Thus, in a -1/1 
response matrix, a prediction value below 0 means that 
the sample belongs to the class designated as -1, while 
a prediction value above 0 implies that the sample 
belongs to the class designated as 1. 
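
The dummy coding and the largest-component class assignment can be sketched as below. This is a compact textbook-style PLS2 regression written with NumPy for illustration only; it is not the implementation used in the work described here, and the function names, simulated data and number of components are assumptions.

```python
import numpy as np

def dummy_matrix(labels, n_classes):
    """n x K response matrix: 1 in the column of the sample's class, -1 elsewhere."""
    Y = -np.ones((len(labels), n_classes))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

def pls_fit(X, Y, n_components):
    """Fit PLS2 regression coefficients B so that Y is approximated by X @ B."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    E, F = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of E'F.
        w = np.linalg.svd(E.T @ F, full_matrices=False)[0][:, 0]
        t = E @ w                                  # score vector
        tt = t @ t
        p, q = E.T @ t / tt, F.T @ t / tt          # X and Y loadings
        E, F = E - np.outer(t, p), F - np.outer(t, q)
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.column_stack(W), np.column_stack(P), np.column_stack(Q)
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return B, x_mean, y_mean

def classify(X_new, B, x_mean, y_mean):
    """Assign each sample to the class with the largest fitted response."""
    y_hat = (X_new - x_mean) @ B + y_mean
    return np.argmax(y_hat, axis=1)

# Simulated example: 20 samples x 40 genes, 10 informative genes.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 10)
X = rng.normal(size=(20, 40))
X[labels == 1, :10] += 4.0

Y = dummy_matrix(labels, n_classes=2)
B, x_mean, y_mean = pls_fit(X, Y, n_components=2)
predicted = classify(X, B, x_mean, y_mean)
```

With two classes the two columns of the dummy matrix are mirror images, so choosing the largest fitted component is equivalent to checking whether the prediction for one class is above or below 0.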

An advantage of PLSR-DA is that the results 
obtained can be easily represented in the form of two 
different plots, the score and loading plots. Score 
plots represent a projection of the samples onto the 
principal components and show the distribution of the 
samples in the classification model and their 
relationship to one another. Loading plots display 
correlations between the variables present in the data 
set. 

It is usually recommended to use PLS-DA as a 
starting point for the classification problem due to its 
ability to handle collinear data, and the property of 
PLSR as a dimension reduction technique. Once this 
purpose has been satisfied, it is possible to use other 
methods such as Linear Discriminant Analysis (LDA), which 
has been shown to be effective in extracting further 
information (Indahl et al., 1999, Chem. and Intell. Lab. 
Syst., 49, p19-31). This approach is based on first 
decomposing the data using PLS-DA, and then using the 
score vectors (instead of the original variables) as 
input to LDA. Further details on LDA can be found in 
Duda and Hart (Pattern Classification and Scene Analysis, 
1973, Wiley, USA). 

The next step following model building is model 
validation. This step is considered to be amongst the 
most important aspects of multivariate analysis, and 
tests the "goodness" of the calibration model which has 
been built. In this work, a cross validation approach 
has been used for validation. In this approach, one or 
a few samples are kept out in each segment while the 
model is built using a full cross-validation on the 
basis of the remaining data. The samples left out are 
then used for prediction/classification. Repeating the 
simple cross-validation process several times holding 
different samples out for each cross-validation leads to 
a so-called double cross-validation procedure. This 
approach has been shown to work well with a limited 
amount of data, as is the case in some of the Examples 
described here. Also, since the cross validation step 
is repeated several times the dangers of model bias and 
overfitting are reduced. 

Once a calibration model has been built and 
validated, genes exhibiting an expression pattern that 
is most relevant for describing the desired information 
in the model can be selected by techniques described in 
the prior art for variable selection, as mentioned 
elsewhere. Variable selection will help in reducing the 
final model complexity, provide a parsimonious model, 
and thus lead to a reliable model that can be used for 
prediction. Moreover, use of fewer genes for the 
purpose of providing diagnosis will reduce the cost of 
the diagnostic product. In this way informative probes 
which would bind to the genes of relevance may be 
identified. 

We have found that after a calibration model has 
been built, statistical techniques like Jackknife 
(Efron, 1982, The Jackknife, the Bootstrap and Other 
Resampling Plans, Society for Industrial and Applied 
Mathematics, Philadelphia, USA), based on resampling 
methodology, can be efficiently used to select or 
confirm significant variables (informative probes). 

The approximate uncertainty variance of the PLS 
regression coefficients B can be estimated by: 

$S^2 B = \sum_{m=1}^{M} \left( (B - B_m)\, g \right)^2$ 

where 

$S^2 B$ = estimated uncertainty variance of B; 
B = the regression coefficient at the cross validated 
rank A using all the N objects; 
$B_m$ = the regression coefficient at the rank A using all 
objects except the object(s) left out in cross 
validation segment m; and 
g = scaling coefficient (here: g=1). 

In our approach, Jackknife has been implemented 
together with cross-validation. For each variable the 
difference between the B-coefficient $B_i$ in a 
cross-validated sub-model and $B_{tot}$ for the total model is 
first calculated. The sum of the squares of the 
differences is then calculated over all sub-models to 
obtain an expression of the variance of the $B_i$ estimate 
for a variable. The significance of the estimate of $B_i$ 
is calculated using the t-test. Thus, the resulting 
regression coefficients can be presented with 
uncertainty limits that correspond to 2 Standard 
Deviations, and from that significant variables are 
detected. 
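
The variance estimate and the 2 Standard Deviation screen can be sketched as follows. This is an illustrative sketch only: the coefficient values are made up, the function names are hypothetical, and the simple rule "the interval of plus or minus 2 standard deviations excludes zero" is used here as a stand-in for the t-test mentioned above.

```python
import numpy as np

def jackknife_variance(B_total, B_sub, g=1.0):
    """Sum over sub-models m of ((B - B_m) * g)**2, for each variable.

    B_total: coefficients of the total model, shape (n_variables,)
    B_sub:   coefficients of the M sub-models, shape (M, n_variables)
    """
    diffs = (np.asarray(B_total) - np.asarray(B_sub)) * g
    return np.sum(diffs ** 2, axis=0)

def significant_variables(B_total, B_sub, n_sd=2.0):
    """Flag variables whose coefficient exceeds n_sd standard deviations."""
    sd = np.sqrt(jackknife_variance(B_total, B_sub))
    return np.abs(np.asarray(B_total)) > n_sd * sd

# Hypothetical coefficients for three variables and two sub-models.
B_total = np.array([1.0, 0.0, 0.2])
B_sub = np.array([[0.9, 0.1, 0.1],
                  [1.1, -0.1, 0.3]])

variance = jackknife_variance(B_total, B_sub)
flags = significant_variables(B_total, B_sub)
```

Only the first variable is flagged in this example, because its coefficient is large relative to its variation across the sub-models.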

No further details as to the implementation or use 
of this step are provided here since this has been 
implemented in commercially available software, The 
Unscrambler, CAMO ASA, Norway. Also, details on 
variable selection using Jackknife can be found in 
Westad & Martens (2000, J. Near Inf. Spectr., 8, p117-124). 

The following approach can be used to select 
informative probes from a gene expression data set: 

a) keep out one unique sample (including its 
repetitions if present in the data set) per cross 
validation segment ; 

b) build a calibration model (cross validated 
segment) on the remaining samples using PLSR-DA; 

c) select the significant genes for the model in 
step b) using the Jackknife criterion; 

d) repeat the above 3 steps until all the unique 
samples in the data set are kept out once (as described 
in step a) . For example, if 75 unique samples are 
present in the data set, 75 different calibration models 
are built resulting in a collection of 75 different sets 
of significant probes; 

e) select the most significant variables using
the frequency of occurrence criterion in the sets of
significant probes generated in step d). For example,
probes appearing in all sets (100%) are more
informative than probes appearing in only 50% of the
generated sets in step d).
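Step e) above can be illustrated with a short sketch. The function name and the probe identifiers in the usage note are hypothetical; the input is assumed to be the collection of significant-probe sets produced in step d), one set per left-out sample.

```python
from collections import Counter

def select_by_occurrence(significant_sets, min_fraction=1.0):
    """Keep probes that occur in at least min_fraction of the generated sets."""
    n_sets = len(significant_sets)
    # Count each probe at most once per set.
    counts = Counter(probe for s in significant_sets for probe in set(s))
    return sorted(p for p, c in counts.items() if c / n_sets >= min_fraction)
```

For example, with four generated sets {p1, p2}, {p1, p3}, {p1, p2} and {p1}, a threshold of 1.0 retains only p1, whereas a threshold of 0.5 retains p1 and p2.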

Once the informative probes for a disease have been 
selected, a final model is made and validated. The two 
most commonly used ways of validating the model are 
cross-validation (CV) and test set validation. In 
cross-validation, the data is divided into k subsets.
The model is then trained k times, each time leaving out
one of the subsets from training and using only the
omitted subset to compute the error criterion, RMSEP (Root
Mean Square Error of Prediction). If k equals the
sample size, this is called "leave-one-out"
cross-validation. The idea of leaving one or a few samples
out per validation segment is valid only in cases where
the covariance between the various experiments is zero.
Thus, a one-sample-at-a-time approach cannot be justified
in situations containing replicates, since keeping only
one of the replicates out will introduce a systematic
bias in our analysis. The correct approach in this case
will be to leave out all replicates of the same sample
at a time, since that would satisfy the assumption of zero
covariance between the CV-segments.
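A minimal sketch of such a segmentation, in which all replicates of one sample are left out together, is given below together with the RMSEP criterion. The function names and the sample identifiers used in the usage note are hypothetical.

```python
import numpy as np

def replicate_segments(sample_ids):
    """Yield (train_idx, test_idx) pairs, one per unique sample.

    All measurements sharing a sample identifier (i.e. replicates)
    are placed in the same left-out segment.
    """
    sample_ids = np.asarray(sample_ids)
    for sid in np.unique(sample_ids):
        test_idx = np.flatnonzero(sample_ids == sid)
        train_idx = np.flatnonzero(sample_ids != sid)
        yield train_idx, test_idx

def rmsep(y_true, y_pred):
    """Root Mean Square Error of Prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

With identifiers such as ["s1", "s1", "s2", "s3", "s3", "s3"], three segments are produced, and the two replicates of s1 are never split between the training and the left-out data.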

The second approach for model validation is to use 
a separate test-set for validating the calibration 
model. This requires running a separate set of 
experiments to be used as a test set. This is the
preferred approach provided that real test data are
available.

The final model is then used to identify a disease, 
condition or stage thereof in test samples. For this 
purpose, expression data of selected informative genes 
is generated from test samples and then the final model 
is used to determine whether a sample belongs to a 
diseased or non-diseased class or has a condition or 
stage thereof. 

Thus viewed from a yet further aspect the present 
invention provides a method of identifying probes useful 
for diagnosing or identifying or monitoring a disease or 
condition or stage thereof in an organism, comprising 
the steps of: 

a) immobilizing a set of oligonucleotide probes,
preferably as described hereinbefore, on a
solid support;

b) isolating mRNA from a sample of a normal
organism (normal sample), which may optionally
be reverse transcribed to cDNA;

c) isolating mRNA from a sample from an organism,
corresponding to the sample and organism of
step (b), which is known to have said disease
or condition or a stage thereof (diseased
sample), which may optionally be reverse
transcribed to cDNA;

d) hybridizing the mRNA or cDNA of steps (b) and
(c) to said set of immobilized oligonucleotide
probes of step (a);

e) assessing the amount of mRNA or cDNA
hybridizing to each of said oligonucleotide
probes to determine the level of gene
expression of genes to which said
oligonucleotide probes bind in said normal and
diseased samples to generate a gene expression
data set for each sample;

f) normalizing and standardizing said data set of
step (e);

g) constructing a calibration model for
classification, preferably using the
statistical techniques Partial Least Squares
Discriminant Analysis (PLS-DA) and Linear
Discriminant Analysis (LDA); and

h) performing Jackknife analysis and identifying
those oligonucleotide probes which are
required for classification of said diseased
and normal samples into their respective
groups.

Preferably a model for classification purposes is 
generated by using the data relating to the probes 
identified according to the above described method. 
Preferably the sample is as described previously. 
Preferably the oligonucleotides which are immobilized in 
step (a) are randomly selected as described below or are 
the probes as described hereinbefore. Such 
oligonucleotides may be of considerable length, e.g. if 
using cDNA (which is encompassed within the scope of the 
term "oligonucleotide"). The identification of such 
cDNA molecules as useful probes allows the development 
of shorter oligonucleotides which reflect the
specificity of the cDNA molecules but are easier to 
manufacture and manipulate. 

The above described model may then be used to 
generate and analyse data of test samples and thus may 
be used for the diagnostic methods of the invention. In 
such methods the data generated from the test sample 
provides the gene expression data set and this is 
normalized and standardized as described above. This is 
then fitted to the calibration model described above to 
provide classification. 

The method described herein can also be used to 
simultaneously select informative probes for several 
related and unrelated diseases or conditions. Depending 
upon which diseases or conditions have been included in 
the calibration or training set, informative probes can 
be selected for the said diseases or conditions. The 
informative probes selected for one disease or condition 
may or may not be similar to the informative probes 
selected for another disease or condition of interest. 
It is the pattern with which the selected genes are 
expressed in relation to each other during a disease, 
condition, or stage thereof, that determines whether or 
not they are informative for the disease, condition or 
stage thereof. 

In other words, informative genes are selected 
based on how their expression correlates with the 
expression of other selected informative genes under the 
influence of responses generated by the disease, 
condition or stage thereof under investigation. In
Examples 1 and 2 provided hereinafter, 139 informative
probes were selected for breast cancer diagnosis and 182
probes were selected for Alzheimer's disease diagnosis
by training on the gene expression data set of genes
representing 1435 or 758 randomly picked cDNA clones for
breast cancer/non-breast cancer samples, or
Alzheimer/non-Alzheimer samples, respectively. Among
the probes selected for breast cancer and Alzheimer's disease,
about 10 probes were informative both for breast cancer
and Alzheimer's disease diagnosis.

For the purpose of isolating informative probes or 
identifying several related and unrelated diseases, 
conditions and stages thereof simultaneously, the gene 
expression data set must contain the information on how 
genes are expressed when the subject has a particular 
disease, condition or stage thereof under investigation. 

The data set is generated from a set of healthy or 
diseased samples, where a particular sample may contain 
the information of only one disease, condition or stages 
thereof or may also contain information about multiple 
diseases, conditions or stages thereof. For example, if 
the isolation of informative probes for Alzheimer 
disease, breast cancer and diabetes is sought, whole 
blood samples can be obtained from an Alzheimer patient 
who has breast cancer and diabetes. Hence, the method 
also teaches an efficient experimental design to reduce 
the number of samples required for isolating informative 
probes by selecting samples representing more than one 
disease, condition or stage thereof. 

As mentioned previously, in view of the high 
information content of most transcripts, the 
identification and selection of informative probes for 
use in diagnosing, monitoring or identifying a 
particular disease, condition or stage thereof may be 
dramatically simplified. Thus the pool of genes from 
which a selection may be made to identify informative 
probes may be radically reduced. 

Unlike prior art technologies, such as microarrays, in
which informative probes are selected from a population
of thousands of genes that are being expressed in a
cell, in the method described herein the informative
probes are selected from a limited number of
randomly obtained genes. For example, from a population
of 1435 cDNA clones, randomly picked from a human whole
blood cDNA library, we were able to select 139
informative probes for breast cancer diagnosis (see
Example 1 and Table 2).

Thus in a preferred aspect of the above mentioned 
method of identifying probes useful for diagnosing or 
identifying or monitoring a disease or condition or 
stage thereof in an organism, said set of
oligonucleotides which is immobilized in step (a) is
randomly selected from a larger set of oligonucleotides,
e.g. from a cDNA library or other oligonucleotide pool,
which may be, but is preferably not, selected from the
set provided herein. Preferably said larger set
comprises oligonucleotides which correspond to
moderately or highly expressed genes. Thus preferably
in methods of the invention, the set of oligonucleotides
according to the invention is replaced with a set of
oligonucleotides which are randomly selected, e.g. from
commercially available oligonucleotide or cDNA
libraries.

As referred to herein "random" refers to selection 
which is not biased based on the extent of information 
carried by the transcripts in relation to the disease, 
condition or organism under study, i.e. without bias
towards their likely utility as informative probes. 
Whilst a random selection may be made from a pool of 
transcripts (or related products) which have been 
biased, e.g. to highly or moderately expressed 
transcripts, preferably random selection is made from a 
pool of transcripts not biased or selected by a 
sequence-based criterion. The larger set may therefore 
contain oligonucleotides corresponding to highly and 
moderately expressed genes, or alternatively, may be 
enriched for those corresponding to the highly and 
moderately expressed genes. 

Random selection from highly and moderately
expressed genes can be achieved in a wide variety of
ways. A strategy used in this work, but not limiting in
itself, involves randomly picking a significant number of
cDNA clones from a cDNA library constructed from a
biological specimen under investigation. Since, in a
cDNA library, the cDNA clones corresponding to
transcripts present in high or moderate amount are more
frequently present than cDNA clones corresponding to
transcripts present in low amount, the former will tend to be
picked up more frequently than the latter. A pool of
cDNA enriched for those corresponding to highly and
moderately expressed genes can be isolated by this
approach.

To identify genes that are expressed in high or 
moderate amount among the isolated population for use in 
methods of the invention, the information about the 
relative level of their transcripts in samples of 
interest can be generated using several prior art 
techniques. Both non-sequence based methods, such as 
differential display or RNA fingerprinting, and 
sequence-based methods such as microarrays or 
macroarrays can be used for the purpose. Alternatively, 
specific primer sequences for highly and moderately 
expressed genes can be designed and methods such as 
quantitative RT-PCR can be used to determine the levels 
of highly and moderately expressed genes. Hence, a 
skilled practitioner may use a variety of techniques 
which are known in the art for determining the relative 
level of mRNA in a biological sample. 

Especially preferably the sample for the isolation 
of mRNA in the above described method is as described 
previously and is preferably not from the site of 
disease and the cells in said sample are not disease 
cells and have not contacted disease cells. 

The following examples are given by way of 
illustration only in which the Figures referred to are
as follows:

Figure 1 shows the effect of Direct Standardization
(DS) on the Alzheimer data measured in two different
series of experiments in which AD denotes Alzheimer's
samples and A, B are non-Alzheimer's samples. The
samples in both series have been labelled systematically
as (xx_7/xx_8), whereas the corrected samples from
series 8 (in b,c,d) have been labelled as (xx_c); thus,
for example, AD2-7 denotes Alzheimer disease sample
number 2 in experiment series 7. The circled spots
represent the samples chosen as the transfer samples.
The connecting lines in figures b,c,d show the proximity
of the replicated samples after applying DS. The dashed
lines in figures a,c,d represent the decision boundary
separating the classes. These lines have not been drawn
on the basis of any statistical criteria, but serve the
purpose of visually separating the classes. All
four figures show scores plots (PC1-PC2) from PCA
analysis: (a) based on non-standardized data, (b) after
direct standardization using 3 transfer samples,
(c) after direct standardization using 4 transfer
samples, and (d) after direct standardization using 8
transfer samples;

Figure 2 shows the projection of normal (including 
benign) and breast cancer samples onto a classification 
model generated by PLSR-DA using the data of 44 
informative genes, in which PC is the principal 
components and N and C are normal and breast cancer 
samples , respectively; 

Figure 3 shows the projection of individuals with 
and without Alzheimer's disease onto a classification 
model generated by PLSR-DA using 182 informative genes; 

Figures 4, 6 and 8 show projection plots as Figure
2 in which the classification model is generated using
719, 111 and 345 cDNAs, respectively, wherein PC denotes the
principal components, N denotes normal and B denotes
breast cancer samples;

Figures 5, 7 and 9 show prediction plots based on 3
principal components using the data of 719, 111 and 345
cDNAs, respectively;

Figure 10 shows a projection plot as Figure 3 in 
which the classification model is generated using 520 
cDNAs; and 

Figure 11 is the prediction plot corresponding to
Figure 10.

Example 1: Diagnosis of Breast Cancer 
Methods 

Whole blood was obtained from the arms of breast cancer 
patients and patients with benign tumours (Ulleval and 
Haukland hospitals in Norway) . All of the patients with 
breast cancer had a malignant tumour of the breast 
(disease samples) . Healthy blood was collected from the 
above two hospitals, or collected at a Health station at 
As, Norway or at DiaGenic AS, Norway, from the arms of 
female donors with no reported signs of breast cancer. 
The blood from healthy individuals or individuals with benign
tumours comprises the normal samples. The blood was
either collected in tubes containing EDTA and stored
immediately at -80°C or was collected in PAXgene tubes
and stored for 12-24 hours at room temperature before
finally being stored at -80°C before use. Further
details of the breast cancer and benign tumour patients
from which blood was taken are provided in Table 5.

mRNA was isolated from the blood of the 29 breast cancer
patients and 46 normal donors and used to prepare
labelled probes by reverse transcribing in the presence
of α-33P-dATP. The first strand cDNA of the normal and
diseased samples was bound separately to 1435 cDNA
clones immobilized on a solid support (nylon membrane).

These cDNA clones were randomly picked, without any 
prior knowledge of their gene sequences, from a cDNA 
library constructed using whole blood of 550 healthy 
individuals (Clontech, Palo Alto, USA) . These methods 
were conducted as follows. 

For amplification of inserts, bacterial clones were
grown in microtiter plates containing 150 µl LB with 50
µg/ml carbenicillin, and incubated overnight with
agitation at 37°C. To lyse the cells, 5 µl of each
culture were diluted with 50 µl H2O and incubated for 12
min. at 95°C. Of this mixture, 2 µl were subjected to a
PCR reaction using 20 pmoles of M13 forward and reverse
primer in the presence of 1.5 mM MgCl2. PCR reactions were
performed with the following cycling protocol: 4 min. at
95°C, followed by 25 cycles of 1 min. at 94°C, 1 min. at
60°C and 3 min. at 72°C either in a RoboCycler®
Temperature Cycler (Stratagene, La Jolla, USA) or DNA
Engine Dyad Peltier Thermal Cycler (MJ Research Inc.,
Waltham, USA). The amplified products were denatured by
incubating with NaOH (0.2 M, final concentration) for 30
min. and spotted onto Hybond-N+ membranes (Amersham
Pharmacia Biotech, Little Chalfont, UK), using a MicroGrid
II workstation according to the manufacturer's
instructions (BioRobotics Ltd, Cambridge, England). The
immobilized cDNAs were fixed using a UV cross-linker
(Hoefer Scientific Instruments, San Francisco, USA).

In addition to the 1435 cDNAs, the printed arrays also
contained controls for assessing background level,
consistency and sensitivity of the assay. These were
spotted at multiple positions and included controls such
as PCR mix (without any insert); positive and negative
controls of the SpotReport™ 10 array validation system
(Stratagene, La Jolla, USA) and cDNAs corresponding to
constitutively expressed genes such as β-actin, γ-actin,
GAPDH, HOD and cyclophilin. Also, oligonucleotides
corresponding to SIX1, β-tubulin, TRP-2, MDM2, Myosin
Light C, CD44, Maspin, Laminin, and SRP 19 were included
to detect disseminated cancer cells.

The total RNA from blood collected in EDTA tubes was
purified using the Trizol LS Reagent protocol
(Invitrogen/Life Technologies). From blood contained in
PAXgene tubes, the total RNA was purified according to
the supplier's instructions (PreAnalytiX, Hombrechtikon,
Switzerland). Contaminating DNA was removed from the
isolated RNA by DNase I treatment using the DNA-free kit
(Ambion, Inc., Austin, USA). RNA quality was determined
visually by inspecting the integrity of 28S and 18S
ribosomal bands following agarose gel electrophoresis.
The concentration and purity of extracted RNA were
determined by measuring the absorbance at 260 nm and 280
nm. mRNA was isolated from the total RNA using Dynabeads
as per the supplier's instructions (Dynal AS, Oslo,
Norway).

Labelling and hybridization experiments were performed 
in batches. The number of samples assayed in each batch 
varied from six to nine. In the case of samples that 
were assayed more than once (replicates) , aliquots 
derived from the same mRNA pool were used for probe 
synthesis. For probe synthesis, aliquots of mRNA
corresponding to 4-5 µg of total RNA were mixed together
with oligo dT25NV (0.5 µg/ml) and mRNA spikes of the
SpotReport™ 10 array validation system (10 pg; Spike 2,
1 pg), heated to 70°C to remove secondary structures, and
then chilled on ice. Probes were prepared in 35 µl
reaction mixes by reverse transcription in the presence
of 50 µCi [α-33P]dATP, 3.5 µM dATP, 0.6 mM each of dCTP,
dTTP, dGTP, 200 units of Superscript reverse
transcriptase (Invitrogen, Life Technologies) and 0.1 M
DTT, labelling for 1.5 hr at 42°C. Following synthesis,
the enzyme was deactivated for 10 min. at 70°C and mRNA
removed by incubating the reaction mix for 20 min. at
37°C in 4 units of Ribo H (Promega, Madison, USA).
Unincorporated nucleotides were removed using ProbeQuant
G 50 Columns (Amersham Biosciences, Piscataway, USA).

Prior to hybridization, the membranes were equilibrated
in 4 x SSC for 2 hr at room temperature and
prehybridized overnight at 65°C in 10 ml prehybridisation
solution (4 x SSC, 0.1 M NaH2PO4, 1 mM EDTA, 8% dextran
sulphate, 10 x Denhardt's solution, 1% SDS). Freshly
prepared probes were added to 5 ml of the same
prehybridisation solution, and hybridization continued
overnight at 65°C. The membranes were washed at 65°C at
increasing stringency (2 x 30 min. each in 2 x SSC, 0.1%
SDS; 1 x SSC, 0.1% SDS; 0.1 x SSC, 0.1% SDS) to remove
unspecific signals.

The amount of labelled first strand cDNA binding to each
spot was assessed and quantified using a PhosphorImager
to generate a gene expression data set. The data was
generated using Phoretix software version 3 (Non Linear 
Dynamics, England) . Background subtraction was 
performed on the generated data by subtracting the 
median of the line of pixels around each spot outline 
from the total intensity obtained from the respective 
spots . 

The background-subtracted data was then normalized and
transformed, after first excluding the 50 lowest and 50
highest signals from each membrane. This step was to exclude
genes that were expressed with a high degree of
variance. Since the excluded genes varied from membrane to
membrane, the expression data from a total of 497 genes were
removed from the data set. The values for the remaining
938 genes were then normalised by using different
approaches such as external controls, dividing each spot
by the median intensity of the observed signal in the
respective membrane, range normalizing the data from
each membrane, and then log transforming the data
obtained.
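The exclusion and normalisation steps described above might be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the data are assumed to be arranged as a membranes x genes array of positive background-subtracted intensities, a gene is assumed to be removed from the whole data set if it falls among the extreme signals on any membrane, only the median-division variant of normalisation is shown, and the function names are hypothetical.

```python
import numpy as np

def drop_extreme_genes(data, n_extreme=50):
    """Remove genes that are among the n lowest or n highest signals on any membrane.

    data: array of shape (membranes, genes). Returns (filtered_data, keep_mask).
    """
    n_genes = data.shape[1]
    order = np.argsort(data, axis=1)             # per-membrane ranking of genes
    extreme = np.concatenate(
        [order[:, :n_extreme], order[:, n_genes - n_extreme:]], axis=1
    )
    keep = np.ones(n_genes, dtype=bool)
    keep[np.unique(extreme)] = False             # union over all membranes
    return data[:, keep], keep

def median_normalize_log(data):
    """Divide each membrane by its median intensity, then log transform."""
    median = np.median(data, axis=1, keepdims=True)
    return np.log(data / median)
```

Because the set of extreme genes differs from membrane to membrane, the union removed can be considerably larger than the 100 genes excluded on any single membrane, which is consistent with the 497 genes reported above.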



The processed data obtained above was then used to 
isolate the informative probes by: 

a) keeping one unique sample (including all 
repetitions of the selected sample) out per cross 
validation segment ; 

b) building a calibration model (cross validated) 
on the remaining samples using PLSR-DA; 

c) selecting the set of significant genes for the
model in step b) using the Jackknife criterion;

d) repeating steps a), b) and c) until all the
unique samples were kept out once (hence, in all, 75
different calibration models were built, resulting in 75
different sets of significant probes);

e) selecting significant variables using the 
frequency of occurrence criterion amongst the 75 
different sets of significant probes. 



The selected informative probes based on the occurrence
criterion were used to construct a classification model.
The result of the classification model based on probes
appearing in at least 90% of the generated sets after
the step of isolating informative probes as described
above is shown in Figure 2, in which it is seen that the
expression pattern of these genes was able to classify
most women with breast cancer and women with no breast
cancer into distinct groups. In this figure PC1 and PC2
indicate the two principal components statistically
derived from the data which best define the systematic
variability present in the data. This allows each
sample, and the data from each of the informative probes
to which the sample's labelled first strand cDNA was
bound, to be represented on the classification model as
a single point which is a projection of the sample onto
the principal components - the score plot.

The ability of the generated model, based on isolated 
informative probes, to predict future samples was 
determined by the double cross-validation approach. The 
performance of the diagnostic test for breast cancer 
based on the occurrence criterion is presented in Table 
6. 

Correct prediction of most breast cancer samples was
achieved. These included all three samples obtained
from women with ductal carcinoma in situ (DCIS) , 11/15 
samples obtained from women with stage I breast cancer, 
all five samples obtained from women with stage II 
breast cancer, and one of two samples obtained from 
women with stage III breast cancer. Interestingly, two 
correctly predicted stage I samples were obtained from 
women having a tumour size of <5 mm in diameter. 

The model also correctly predicted the class of most 
non-cancer samples (41/46), including those that were 
obtained from women with non-cancerous breast 
abnormalities . 

That the gene transcripts are not from disseminated
disease cells has been confirmed by several lines of
evidence. Firstly, the informative genes were expressed
constitutively at high or moderate levels in blood cells
of women irrespective of whether they had cancer or not.
Secondly, in the assay described in this Example, at
least 720 disseminated cells in blood samples would be
required in order to identify transcripts. Since the
average number of disseminated cells present in blood
during different stages of breast cancer is much lower
(organ confined breast cancer, 0.8 cells per ml; invasive
breast cancer spread to lymph nodes only, 2.4 cells per
ml; and metastatic breast cancer, 6 cells per ml;
SD>100%) (29), we believe that the signals being detected
originated from peripheral blood cells and could not have
originated from disseminated cells. Thirdly, we were
not able to detect any signal from the eight cancer
markers known to have elevated expression in malignant
cancer cells, including cancer cells that are
disseminated in the blood.

Example 2: Diagnosis of Alzheimer's disease

Similar experiments were conducted with samples from 
Alzheimer's patients. In this method 7 patients 
diagnosed with Alzheimer's Disease at the Memory Clinic 
at Ulleval University Hospital were used in the trial. 
The patients were confirmed as having Alzheimer's 
disease based on the following criteria: 

* A standardized interview with a care-giver using
IQCODE, an ADL scale and a scale measuring
behaviour of the patient (Green scale).

* Neuropsychological evaluation using MMSE, Clock
drawing test, Trailmaking test A and B (TMT A and
B), Kendrick object learning test (visual memory
test), part of the Wechsler battery and Benton
test.

* A psychiatric evaluation using scales for detection
of depression, MADRS for interviewing the patient
and Cornell scale for interviewing the care-giver.

* A physical examination.

* Laboratory tests of blood samples to rule out other
diseases.

* CT scan of the brain.

* SPECT of the brain.

The mean age of the patients was 72.3 with an age range 
of 69-76. The mean MMSE score was 22.0 (the maximum 
score attainable being 30). 

Six age-matched individuals without diagnosed 
Alzheimer's disease were used as a control. All had 
been tested with MMSE and had a minimum score of 28 
(mean: 28.4) . The mean age of the normal control group 
was 73.0 and the age range 66-81. A sample from a 16- 
year old individual, with a consequent minimal chance of 
having Alzheimer's disease, was also included as an 
additional control . 

Using the methods described above (except that
hybridization to 758 rather than 1435 cDNA clones was
performed), informative probes were selected based on the
occurrence criterion and used to construct a
classification model. The results of the classification
model based on probes appearing at least once in the
generated sets after the method to isolate informative
probes as described above are shown in Figure 3, in which
it will be seen that the expression pattern of these
genes was able to classify individuals with or without
Alzheimer's disease into distinct groups. In this
Figure PC1 and PC2 indicate the 2 principal components
statistically derived from the data which define the
systematic variability present in the data. This allows
each sample, and the data from each of the informative
probes to which the sample's cDNA was bound, to be
represented on the classification model as a single
point which is a projection of the sample onto the
principal components - the score plot.



The ability of the generated model, based on isolated 
informative probes, to predict future samples was 
determined by double cross-validation. The
performance of the diagnostic test for Alzheimer's 
disease is presented in Table 7. 



Appendix A 






Partial Least Squares regression (PLSR) 



Let a multivariate regression model be defined as: 



Y = XB + F 



where 

X is an NxP matrix of P predictor variables (genes)
measured on N objects (samples);

Y (NxJ) contains the J predicted variables. In our case Y
represents a matrix containing dummy variables;

B is a (PxJ) matrix of regression coefficients; and

F is an NxJ matrix of residuals.



The structure of the PLSR model can be written as: 



X = T P^{T} + E_{A}, and
Y = T Q^{T} + F_{A},

where

T (NxA) is a matrix of score vectors which are linear
combinations of the x-variables;

P (PxA) is a matrix with the x-loading vectors p_a as
columns;

Q (JxA) is a matrix with the y-loading vectors q_a as
columns;

E_A (NxP) is the residual matrix for X after A factors; and
F_A (NxJ) is the residual matrix for Y after A factors.



The criterion in PLSR is to maximize the explained
covariance of [X,Y]. This is achieved by the loading
weights vector w_{a+1}, which is the first eigenvector of
E_a^{T} F_a F_a^{T} E_a (E_a and F_a are the deflated X and Y
after a factors or PLS components).

The regression coefficients are given by:

B = W (P^{T} W)^{-1} Q^{T}



A PLSR model with full rank, i.e. maximum number of
components, is equivalent to the MLR solution. Further
details on PLSR can be found in Martens & Naes, 1989,
Multivariate Calibration, John Wiley & Sons, Inc., USA
and Kowalski & Seasholtz, 1991, supra.
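
The relations above can be illustrated with a short numerical sketch. This is one possible NIPALS-style implementation written for illustration only, not the algorithm of any particular software package; the function name is hypothetical, and column-centring of X and Y before fitting is an assumption that the text does not state.

```python
import numpy as np

def pls_regression_coefficients(X, Y, n_components):
    """Return B = W (P^T W)^-1 Q^T for a PLSR model with n_components factors."""
    E = X - X.mean(axis=0)          # deflated X, starts as centred X
    F = Y - Y.mean(axis=0)          # deflated Y, starts as centred Y
    W, P, Q = [], [], []
    for _ in range(n_components):
        # First eigenvector of E^T F F^T E equals the first left
        # singular vector of E^T F.
        u, _, _ = np.linalg.svd(E.T @ F, full_matrices=False)
        w = u[:, 0]                 # loading weights
        t = E @ w                   # score vector
        tt = t @ t
        p = E.T @ t / tt            # x-loadings
        q = F.T @ t / tt            # y-loadings
        E = E - np.outer(t, p)      # deflate X
        F = F - np.outer(t, q)      # deflate Y
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = (np.column_stack(m) for m in (W, P, Q))
    return W @ np.linalg.inv(P.T @ W) @ Q.T
```

When the number of components equals the number of (linearly independent) predictor variables, the coefficients returned by this sketch coincide with the least-squares (MLR) solution on the centred data, in line with the statement above.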

Example 3: Validation of Example 1, diagnosis of breast
cancer

The results in Example 1 were validated by using the
informative probes identified in Example 1 on new breast
cancer and control samples.

Methods 

The methods, essentially as described in Example 1, were 
used. Blood was taken from patients as described in 
Table 8. However, blood was collected in PAXgene tubes 
and the first strand labelled cDNAs were hybridized to 
719 cDNAs spotted on nylon membranes along with other 
controls as described in Example 1. After background 
subtraction using control spots, the data of each 
membrane was normalized using the inter quantile range. 

The data was analysed as described in Example 1 and the 
model validated by cross validation. 

The 719 cDNAs which were spotted are a subset of the 
cDNAs spotted in Example 1 and include 111 cDNAs 
described in Table 2 and which were found to be 
informative in Example 1 . 

Results 

The results are shown in Figures 4 to 9. Figures 4, 6
and 8 are projection plots similar to Figure 2. Figure 4
shows the projection of normal and breast cancer patients'
samples onto a classification model generated using all
719 cDNAs. Figure 6 is similar but uses a classification
model generated with the 111 probes common to Example 1.
Figure 8 uses the 345 sequences of the 719 for which
sequence information is provided herein. In each case
classification of normal and breast cancer groups was 
possible. Figures 5, 7 and 9 show prediction plots 
which reflect the ability of the generated models to 
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correctly diagnose breast cancer. In the 3 prediction 
plots shown, the disease samples appear on the x axis at 
+1 and the non-disease samples appear at -1. The y axis 
represents the predicted class membership. During 
prediction, if the prediction is correct, disease 
samples should fall above zero and non-disease samples 
should fall below zero. In each case almost all samples 
are correctly predicted. 
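
The reading of the prediction plots can be expressed as a simple decision rule. The sketch below is illustrative only: the function names are hypothetical, and treating a predicted value of exactly zero as non-disease is an assumption not stated in the text.

```python
def predicted_class(value):
    """Map a predicted class-membership value to +1 (disease) or -1 (non-disease)."""
    # Assumption: a value of exactly zero is assigned to the non-disease class.
    return 1 if value > 0 else -1


def fraction_correct(actual, predicted_values):
    """Fraction of samples whose prediction falls on the correct side of zero."""
    hits = sum(predicted_class(p) == a for a, p in zip(actual, predicted_values))
    return hits / len(actual)
```

For example, with actual classes `[1, 1, -1, -1]` and predicted values `[0.8, -0.1, -0.6, -1.2]`, the second sample falls on the wrong side of zero and three of the four samples are correctly predicted.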

Example 4: Validation of Example 2, diagnosis of 
Alzheimer's disease 

The results in Example 2 were validated by using the 
informative probes identified in Example 2 on new 
Alzheimer's patient samples. 

Methods 

The methods used were essentially as described in 
Example 2. Twelve female patients who were diagnosed 
with Alzheimer's disease at the Memory Clinic at Ullevål 
University Hospital, and whose diagnosis was confirmed 
using the criteria of Example 2, took part in the trial. 
The mean age of the patients was 72.3 years, with an age 
range of 66-83. The mean MMSE score was 22.0 (the 
maximum score attainable being 30). 

Sixteen age-matched female individuals without diagnosed 
Alzheimer's disease were used as the normal control 
group. All had been tested with MMSE and had a minimum 
score of 29. The mean age of the normal control group 
was 74.0 and the age range 66-86. 

After transfer of the blood to PAXgene tubes, total mRNA 
was isolated from the blood of the Alzheimer's disease 
patients and of the control group donors according to 
the manufacturer's instructions (PreAnalytiX, 
Hombrechtikon, Switzerland). The isolated mRNA was 
labelled during reverse transcription in the presence of 
α-33P-dATP, yielding a labelled first strand cDNA. 
Hybridization was performed as described previously onto 
730 cDNA clones picked at random, without knowledge of 
their gene sequences, from a cDNA library prepared from 
the whole blood of 550 healthy individuals. 

Results 

The results are shown in Figures 10 and 11. Figure 10 
is a projection plot generated using 520 probes which 
have been sequenced. Figure 11 is a prediction plot and 
shows correct prediction of almost all samples. 
List of probes informative for disease diagnosis 
111-40 


499 


349 


132 


++37 


111-43 


4++ 


382 


500 


++3 8 


111-44 


4++ 


382 


134 


++3 9 


111-53 


g Q Q 


390 


142 


+8-4 0 


111-56 


5 Q 3 


109 


144 


++41 


111-57 


5 Q 4 


374 


145 


++ 


III 60 




3 2 5 
■44 


III 6 0 


- 






4442 


111-61 


444 


521 


148 


4443 


111-63 


5 q 3 


575 


150 


44 


III 68 


- 


- 




444 4 


111-74 




502 


155 


444 5 


111-80 




585 


158 


^ L 


III 82 


- 


- 




4446 


III-85 


444 




161 


444 7 


111-89 


444 




165 


44 


III 92 


- 


- 




44 


III 96 


- 


- 




444 8 


IV- 14 


594 


545 


275 


4449 


IV- 15 


1185 


628 


402 


44 


IV 2 3 


- 


- 




4450 


IV- 2 6 


4444 




403 


44 


IV 2 6 


- 


- 




44 


IV 2 9 


- 


- 




4451 


IV- 31 


g Q "7 


268 


278 


4452 


IV- 3 2 


599 


569 


279 


44 


IV 3 4 


- 


- 




44 


IV 3 5 


- 


- 




44 


IV 41 


- 


- 




44 


IV 4 5 


- 


- 




4453 


IV- 5 3 


44 


362 


498 


44 


IV 62 


- 


- 




445 4 


IV- 6 9 


444 


286 


4 


4455 


IV- 8 0 


444 


579 


291 


44 


IV 8 2 


- 


- 




44 


IV 93 








4456 


IX-10 


735 


641 


314 


44 


IX 12 








445 7 


IX-38 


444 


583 


317 


4458 


IX-39 




424 


318 
#4 


IX 4 2 


- 


- 






IX-48 


444 


626 


319 


4460 


IX-77 


7 3 5 


556 


325 


44 


T7 Q 2_ 




- 






Y g 2 


- 


- 






V-03 






296 


444462 


V-0 4 


4444 


■ 


297 


46* 




- 


- 






V-07 


4448- 


293 


298 


444464 


V-11 


444843- 


599 


404 


4444-65 


V-12 


444 


498 


301 


40# 


V 15 


- 


- 




IPS 


V 17 


- 


- 






V 21 


- 


- 




]_ p o 


Y 2 5 


- 


- 








- 


- 




444 


Y 3 5 


- 


- 




iii 


¥—44 


- 


- 




4443- 


V 4 2 


- 


- 




44t3 


V 4 3 


- 


- 




44^4 


V 4 7 


- 


- 




4444 


V 4 9 


- 


- 




4444 


V 5 2 


- 


- 




4-4-4 


V 5 4 


- 


- 




444466 


V-55 


44 




499 


4449- 


V 5 8 


- 


- 




2_ 2 o 


V 5 9 


- 


- 




4444 


Y g 5 


- 


- 




TOO 

LiLiL 


Y g g 








12 3 


V 71 








4444 


V 7 5 










V 7 9 








444467_ 


V-80 


-7 2 (5 


260 


311 
447- 


¥-44 


- 


- 






4-44 


- 


- 






4-44 


- 


- 




2_ c< 


4-44 


- 


- 






VI 0 2 


- 


- 




44468 


VI-04 


444 


122 


339 


444_69 


VI-07 


44 


405 


1 




V4 — 44 


- 


- 








- 


- 




4447C^ 


VI-12 


o g g 


667 


341 




VI-14 


444 


642 


343 


2_ 3 o 


VI 17 


- 


- 




44472 


VI-20 






346 


444 


VI 21 


- 


- 




4447^ 


VI-23 


o n Q 


634 


347 


444 


VI 3 4 


- 


- 




444 


VI 41 


- 


- 






VI 4 2 


- 


- 




444 


VI 4 3 


- 


- 




444 


VI 4 4 


- 


- 




44474 


VI-48 


444 




355 


444 


VI 4 9 


- 


- 




44475 


VI-50 


o g ^ 


585 


356 




VI 5 2 


- 


- 




4447 6 


VI-53 


g 95 


560 


357 


4447 7 


VI-55 


444 


509 


359 


«* 


VI 65 


- 


- 




44478 


VI-70 


10 9 


550 


2 


444 


VI 71 










VI 7 2 








4447 9 


VI-74 


gQ5 


655 


365 


4448 0 


VI-76 


444 


582 


367 


444 


VI 7 9 
2_ ,c ( -| 


VI 7 9 


- 






4-£4- 


VI 8 4 


- 


- 




4-££81_ 


VI-87 


4^4-4- 


595 


370 


4-4^82 


VI-88 


^4-44 


651 


371 


4-£4 


VI 9 0 


- 


- 




«* 


VI 93 


- 


- 




4-££83_ 


VI-95 




230 


374 




V-i — 9-& 


- 


- 








- 


- 




4-^3-84 


VII-03 


1196 


412 


411 




VI I 0 6 


- 


- 




«i 


VI I 10 


- 


- 




4-^ 


VI I 11 


- 


- 




^85 


VII-15 


4-4-^4 


439 


414 


W486 


VII-19 




580 


171 


4-^8 7 


VII-21 




671 


173 


4-4^ 


VII 2 5 


- 


- 




^8 8 


VII-32 


W4- 


457 


179 


4-4^8 9 


VII-36 


*» 


209 


182 




VII-39 


g "7 g 


541 


183 


^91 


VII-42 




502 


186 


4-&4-92 


VII-43 


^QQ 


316 


187 


4-£4^93 


VII-46 




631 


190 


4-&^94 


VII-47 


4-£44& 


526 


415 


4-&495 


VII-48 


4-2-04- 


613 


416 


4-&^96 


VI 1-5 9 




565 


199 




VII GO 


- 


- 




^■97 


VII-63 




98 


201 


4-£&98 


VII-66 




362 


204 


4-£^ 


VII 6 7 








4-W99 


VII-72 


gQQ 


595 


206 


4-»4-10 0 


VII-73 




522 


207 




VII 7 5 
















209 
4^101 


VII-76 


,c pi 


624 




444102 


VII-77 


4444 


692 


418 


444103 


VII-80 


g Q 5 


338 


210 


44>£104 


VII-81 


g Q g 


556 


211 


444 


VII 8 3 


- 


- 




444 


VI I 8 6 


- 


- 




444 


VI I 8 8 


- 


- 




444105 


VII-90 


444 




216 


2 0110 6 


VII-91 


444 




217 


44410 7 


VII-93 


444 


379 


219 


2 q 3 


VI 1 1 01 


- 


- 




2 0 4 


VIII 0 2 


- 


- 




2Q5 


VI I I 0 3 


- 


- 




2 0 6 


VI 1 1 0 6 


- 


- 




44410 8 


VIII-09 


444. 


598 


221 


2 Q O 


VI 1 1 10 


- 


- 




2 q o 


VIII 15 


- 


- 




44410 9 


VIII-20 


g 2 Q 


419 


229 




VIII 2 2 


- 


- 




444 


VIII 2 6 


- 






34-3-110 


VIII-28 


444 


511 


235 


■34-4111 


VIII-29 


g q 5 


592 


236 


444112 


VIII-30 


g g g 


572 


237 


444113 


VIII-31 


g g T 


482 


238 


444-114 


VIII-32 


gQ Q 


545 


239 


44-8-115 


VIII-33 


g g g 


624 


240 


44-4 


VIII 3 9 


- 


- 




4414116 


VIII-41 


6 4 s 




245 


444117 


VIII-42 


6 4 6 


■ 


246 


2 2 2 


VI 1 1 4 4 








444118 


VIII-46 


444 


425 


249 


444119 


VIII-48 




251 


251 


2 2 5 


VIII 5 8 
















261 
2^4120 


VIII-64 


f x ,C 


627 




2447- 


VIII 65 


- 


- 




££3-121 


VIII-66 


g g 5 


345 


262 


44494.22 


VIII-67 


g g g 


252 


263 


23Q 


VIII 7 4 


- 


- 




4444L23 


VIII-76 


444 


691 


270 


2 3 2 


VI 1 1 7 9 


- 


- 








- 


- 






VIII 00 


- 


- 




235 


VIII 9 5 


- 


- 




2 3 g 


VI 1 1 9 7 


- 


- 




237 


VIII 91 


- 


- 




2 3 g 


VIII 92 


- 


- 




239 


VIII 93 


- 


- 




o/[Q 


VIII 95 


- 


- 




£44 


X 0 4 


- 


- 




444124 


X-07 


O Q Q 


641 


328 


444125 


X-15 


44-4 


132 


329 


444126 


X-29 


o 2 1 


370 


331 


2 /[ 5 


X 3 4 


- 


- 




2 /[ g 


X 3 5 


- 


- 




444127 


X-54 


Q 3 J 


603 


334 


444128 


X-56 




71 


335 


444129 


X-68 


1207 


642 


421 


444130 


X-72 


444 


622 


336 


444131 


X-94 


O g Q 


601 


337 


252 


XI 0 7 


- 


- 




^132 


XI-13 


4-444 


620 


423 


444 


XI 5 0 










XI 5 9 










XI-81 


4-444 


374 


426 


444134 


XII-07 


442-4-4 


567 


427 


2 5 o 


XII 17 
2 s ^ 


XII 2 6 


- 


- 






XII 2 7 


- 


- 






XII 31 


- 


- 




2 2 


XII 3 2 


- 


- 




^135 


XII-35 


1214 


620 


428 


2 6 4 


XII 3 6 


- 


- 






XII 5 2 


- 


- 






XII-59 


1216 


484 


430 


^137 


XIII-19 




5 5 9 


433 


2 go 


XIII 2 9 


- 


- 




£^138 


XIII-52 


£££ 




378 


2 7 q 


XIII 62 


- 


- 






XIII 8 4 


- 


- 




5-7-2-139 


XIII-92 


££££ 


741 


435 


273 


XV 18 


- 


- 




^■7-4140 


XV- 2 2 


- 


- 


388 


275 


XV 2 4 


- 


- 




^141 


XV- 2 5 


1224 




436 




VT7- O O 


- 


- 




273 


XV 3 4 


- 


- 






XV 4 2 


- 


- 






XV 6 8 


- 


- 




2 o ]_ 


XV- 7 4 


- 


- 




2g2 


XV 93 


- 


- 




2g3 


XV 9 4 


- 


- 






XV 9 6 


- 


- 




£^142 


XVI-3 6 


1056 


435 


382 


£££143 


XVI-53 


££££ 


741 


439 


-2-8-7- 


XVI 5 9 








££8-144 


XVI- 6 6 


1074 


689 


384 


£££145 


XVI-7 6 


££££ 


198 


386 


£££146 


XVI-7 7 


1084 


198 


387 


£££ 


XVI I 0 7 
292 


XVII 0 9 


- 


- 






XVII 17 


- 


- 




^4 


XVII 2 9 


- 


- 




2 g 15 


XVII 2 9 


- 


- 




3-££147 


XVI 1-31 


4-±££ 




392 


£W 


XVII 3 6 


- 


- 




2 9 Q 


XVII 3 9 


- 


- 




^148 


XVI 1-40 


1231 


203 


440 


3 0 014 9 


XVI 1-48 


1149 




393 


3 Q ]_ 


XVII 5 5 


- 


- 




3 q 2 


XVII 5 9 


- 


- 




3 Q 3 


XVI I 6 7 


- 


- 




3 q g 


XVI I 7 2 








^5-150 


XVI 1-7 6 


££££ 


650 


394 


3 Q g 


XVII 9 2 








££3-151 


XVI I -8 7 


1165 


502 


395 




XVII- 95 


1172 


648 


396 
Table 1b 

List of sequences of probes informative for disease diagnosis 
Please see the note at the bottom 



Clone ID 


SEQ ID NO. in Sequence Listing 




-E — Q-9- 


■2-943- 


I — 1 0 


2 9 9 6 


1 — 13 


1 O O *1 A A A 

13 314 4 4 


1 — 14 


I 1 "7 o o r> T 

I I / o 3 y / 


1 — 15 




4- ir& 




1 — 17 


3 0 2 8 


1 — 19 


3 0 4 9 


-E — 34+ 





1 — 2 2 


3 0 61 0 


T O Q 


-3-9-7- 


1 — 2 4 


3 0 81 1 


1 — 25 


^0-9-12 


1 — 2 8 




1 — 3 0 


1 i o n o n o 
1 1 o U 3 9 o 


1-31 


3111 4 


-E — 3-2- 


— 


1 — 3 4 


3131 5 


1 — 3 7 


1 /I /I Pi A O O 

1 4 4 U 4 o Z 


1 — 3 8 


3141 6 


1-3 9 


3151 7 


1 — 4 0 


3161 8 


1 — 42 


1 Q Q O /I /I C 


i — 44- 


317 


1 — 45 


441-8- 


i — 4-6- 


319 


1 — 47 


44244 


1 — 4 8 


3 211 9 


1 — 4 9 


o o o o n 
3 z / z U 


1-5 3 


3 2 3 2 1 


1 — 5 4 


11813 9 9 


1 — 5 6 


^j^ 22 


1-57 


44245-2 3 


1 — 5 8 


4424^2 4 


1 


Z±l£^ 


1 — 64 


3 2 8 2 6 


1-67 


■ y 


1-69 


3 312 8 




3 3 2 


4- — 7-2- 


4^43- 


4- — 7-3- 




1 — 77 


ff^ 29 


4- — 7-9- 




1 — 80 


f^ 3 ° 


1 — 81 




1 — 82 


3 3 93 2 


1 — 86 


1 O O C A a n 


1-88 


1182400 


1-95 


1337448 


11-02 




11-03 


^4-3 4 


11-05 




11-06 


3^4-3 6 


11-07 




11-08 




II 09 




11-10 


3^-3 9 


11-11 


^^4 0 


11-12 


447-4441 


11-13 


^7-4-4 2 


II 14 


372 


11-15 


444^4 3 


11-16 


4>7-44 4 
— |^ 


4i i-& 


3 7 6 


4-E — 3-0- 


3 7 7 


11 — 21 


3 7 84 5 


-t-E — 2-3- 


3 7 9 


11 — 23 


3 8 0 4 6 


11 — 24 


3 814 7 


11 — 25 


O O O A Q 
J O Z 4 O 


T T OK 

± ± — z u 


j o j 4 y 


11 — 27 


3 8 4 5 0 


4i — 34*- 


3 o 5 


11 — 29 


^^" 51 


11 — 30 


f!^ 5 ^ 


11 — 31 





11 — 32 


Jo 95 4 


11 — 33 


^ 9 ' 55 


11 — 34 


3 915 6 


4i — 34^ 









T T Q Q 


3 9 4 5 7 


11 — 39 


3 9 5 5 8 


11 — 40 


3 9 65 9 


11 — 41 


3 9 7 6 0 


11 — 42 




11 — 43 


^t r Z> 


11 — 44 


4 0 0 63 


11 — 46 


4 0164 


11 — 47 


4 0 2 65 


11 — 48 


4 0 3 6 6 




4 0 4 


11 — 50 


4 0 5 6 7 


11 — 52 


4 0 6 6 8 


11 — 53 


4 0 7 6 9 


11 — 54 


4 0 8 7 0 


11 — 55 


4 0 9 7 1 


11 — 56 


410 7 2 


11 — 57 


4117 3 


11 — 58 


412 7 4 


11 — 59 


413 7 5 


11 — 60 


414 7 6 


11 — 61 


415 7 7 


11 — 62 


416 7 8 


1 1 — b J 


417 7 9 


11 — 64 


418 8 0 


11 — 65 


4 ^ Q ] 


II- ° 6 


^TIZ 


11 — 67 


4 218 3 


11 — 68 


A O O O /I 

4 z z o 4 


11 — 69 


4 2 3 8 5 


11 — 70 


4 2 4 8 6 


11 — 71 


4 2 5 8 7 


11 — 72 


4 2 6 8 8 


11 — 73 


4 2 7 8 9 


11 — 74 


/i o o q n 
4 Z o ^ U 


11 — 75 


4 2 9 9 1 


11 — 76 


/i q n q o 
4 J U ^ Z 


11 — 77 


4 3193 


11 — 78 


4 3 2 94 


11 — 79 


4 3 3 95 


11 — 80 


4 3 4 9 6 


11 — 81 


4 3 5 9 7 


11 — 82 


4 3 6 9 8 


T T ° Q 


/i q 3 

i -j / 


11-84 


44^-99 


II 85 


4*4 


II 86 




11-87 




11-88 


44*101 


II 89 


44^ 


II 90 


444 


II 91 


444 


11-92 


444102 


11-93 


444103 
11 — 94 


A A o i n A 

4 4 8 1 U 4 


-£4! 9-?5- 


4-4-9- 


11 — 96 


4 5 01 0 5 


1 1 1 — 0 1 


4 5 21 0 6 


1 1 1 — 0 2 


4 5 31 0 7 


1 1 1 — 0 3 


4 5 4 1 U o 


III — 9-4- 


4 5 5 


III — 943- 


4 5 7 


1 1 1 — 0 6 


/ICQ"! n Q 

Q jb! U y 


1 1 1 — 0 7 


4 5 91 1 0 


1 1 1 — 0 8 


4 6 01 1 1 


1 1 1 — 0 9 


4 611 1 2 


1 1 1 — 1 1 


4 621 1 3 


1 1 1 — 1 2 


4 631 1 4 


1 1 1 — 1 3 


4 641 1 5 


III — 3r4- 


^ff^ 


III — 3—5- 




III — 




4 6 7 


1 1 1 - I 7 




1 1 1 — 1 8 


4 6 91 1 6 


III — 1-9- 


4 7 0 


1 1 1 — 2 0 


1 1 o J 4 U 1 


1 1 1 - 2 1 


4 711 1 7 


1 1 1 — 2 2 





1 1 1 — 2 3 




1 1 1 — 2 4 


4 / 4 1 Z U 


1 1 1 — 2 5 


4 7 51 2 1 


1 1 1 — 2 6 


4 / b 1 Z z 


1 1 1 — 2 7 


4 7 71 2 3 


1 1 1 — 2 8 


a n Q 1 O /l 

Q / o 1 Z 4 


1 1 1 — 2 9 


4 7 91 2 5 


1 1 1 - 3 1 


4 811 2 6 


1 1 1 — 3 2 


A O O 1 O T 

4 o z 1 z / 


1 1 1 — 3 3 


A Q O 1 O Q 

doJl Z o 


1 1 1 — 3 4 


A Q A 1 O O 

'loii z y 


1 1 1 — 3 5 


A O C 1 Q O 

4 o 5 1 J U 


III — 3-7- 





1 1 1 — 3 9 


4 8 71 3 1 


1 1 1 — 4 0 


A Q Q 1 O O 

lool J Z 


1 1 1 — 4 2 


/I n o i o o 
4 O D 1 J J 


1 1 1 — 4 3 


/i o n c n n 
4 ij U 5 U U 


1 1 1 — 4 4 


4 911 3 4 


1 1 1 - 4 5 


4 921 3 5 


1 1 1 — 4 6 


4 931 3 6 


1 1 1 — 4 7 


l^^oo 


1 1 1 — 4 8 




1 1 1 — 4 9 


i 'j t> i j y 


1 1 1 — 5 0 


/i n t n /in 
(ID / I 4 U 


III — 


4 9 8 


1 1 1 — 5 2 


4 9 91 4 1 


1 1 1 - 5 3 


c n n i a o 
5 0 01 4 2 


III — §-4- 


5 Q ]_ 


1 1 1 — 5 5 


c n o i /i o 
5 U z 1 4 J 


1 1 1 — 5 6 


c n o i /i /i 


1 1 1 — 5 7 


5 0 41 4 5 


1 1 1 — 5 8 


5 0 51 4 6 


1 1 1 — 5 9 




1 1 1 — 6 1 


c n T 1 /l Q 

jU / I 4 o 


1 1 1 - 62 


r n 01 /in 

5 0 81 4 y 


1 1 1 — 63 


5 0 91 5 0 


1 1 1 — 64 


5101 5 1 


III — 6t§- 


511 


T T T _ C C 

111 DO 


31/1 O Z 


111-67 


^«153 


III 69 


^4-4 


111-70 


^4^154 


III 71 




III 73 


-&4-7- 


111-74 


^4r3-155 


111-76 


^4156 


III 77 


5 2 0 


111-78 


^4-157 


III 79 


^22 
1 1 1 — 8 0 


5 2 3 1 5 8 


1 1 1 — 8 1 


CO /I 1 c O, 

j ^ fi i o y 


1 1 1 — 8 2 


1 3 4 8 4 3 I 


1 1 1 — 8 3 


5 2 51 6 0 


1 1 1 — 8 5 




1 1 1 — 8 6 


5 2 71 62 


III — 




1 1 1 — 8 8 


3 Z y 1 b3 & 1 b4 


1 1 1 — 8 9 


5 3 01 65 


III 94- 


5 31 


1 1 1 — 92 


1 OCT /ICO 

1 J 5 1 4 5 Z 


1 1 1 — 93 


5 3 21 6 6 


1 1 1 — 94 


5 3 3 1 6 7 


1 1 1 — 9 5 


5 3 41 6 8 


III 


5 3 5 


-t^ — 0-3- 




IV— 0 4 


CQOQ T O 


IV— 1 3 




IV— 1 4 


6 8 4 2 7 5 


IV— 1 5 


i o o c a n o 
1 1 o 5 4 U Z 


IV— 1 7 


6 8 5 2 7 6 


IV— 2 3 


13 3 3 4 3 4 


IV— 2 6 


1 i o r /i n o 
118 64 U 3 


IV— 2 8 


"1^72 7 3 


IV— 3 1 




IV— 3 2 


b o o z / y 


IV— 3 5 




IV— 3 7 


<§f£-4 9 7 


IV— 3 8 


r~ q q o q n 
bo o U 


IV— 4 0 


r o, n o q i 
b y U Z o 1 


IV- 4 2 


/*" O 1 o o o 

b y 1 Z 8 z 


IV— 4 3 


1 O O O /I A 0 

1 z a J 4 4 1 


IV— 4 4 


r~ Q O O Q o 
b y Z Z o J 


IV— 4 7 


r o o r > /i 
b y J Z o 4 


IV— 5 3 


-64-4 9 8 


IV— 5 5 


r o /i o q c 
b y 4 z o 5 


-iV — 




IV— 6 1 


r a r o <~> c: 
b y bZ o b 


IV— 6 4 


r c\ t o o i 

by / z 8 / 


IV— 6 5 


b y o z o o 


IV— 6 9 


192 4 


IV— 7 2 


O O O Q O 

b y y z o y 


IV— 7 3 


o n o o on 

/ u u z y U 


IV— 8 0 


t n 1 o o i 

/ u i z y 1 


-tV- — 8-2- 


"^jogo 


IV— 8 5 




IV— 9 3 


t n o o o o 

/ u j z y 3 


TV— 9 5 


1 Pi A O O A 

i u d z y 4 


IV— 9 6 


t ri c o o c 
/ U 5 z y 5 


IX— 1 0 


/ JQJ 1 4 


i-X- — i-2- 


^3-8- 


IX— 1 3 


7 3 93 1 5 


IX— 2 4 


7 4 7 3 1 6 


IX— 3 8 


7 5 7 3 1 7 


IX— 3 9 


■"7 C Q O 1 O 


IX— 4 8 


7 6 4 3 1 9 


IX— 5 0 


7 6 63 2 0 


IX— 5 6 


"7 r o o o i 
/boo Z 1 


IX— 6 2 


T T o o o o 
/ 7 3 3 Z Z 


IX— 6 5 


T "7 O O Q 

/ / bJ z 3 


IX— 7 2 


T O O O O /I 

/ o Z 3 z 4 


IX— 7 7 


7 S 5 3 2 5 


TY-Q1 


T Q £ O O (C 


IX-96 


£^43 2 7 


V-01 


1361458 


V-0 3 




V-0 4 




V-0 7 


W8-2 9 8 


V-08 


3-QS-2 9 9 


V-09 


^300 


V-11 


11884 0 4 


Vl-16 


£4^3 4 4 


Vl-19 


£^3 4 5 
V— 1 2 


7113 U 1 


V- 1 7 


1 O C/l /l c o 


V — irS- 


712 


V— 2 0 


/ 1 J J U Z 


V— 2 4 


/Ido U o 


V-2 5 


1 J bo 4 b U 


V— 2 8 


1 1 o 9 4 U o 


V— 35 


13 6 64 6 1 


V — 3-7- 


7 1 6 


V— 3 8 


1 1 9 U 4 U b 


V— 3 9 


1 1 U 9 o o 9 


V — 4 0 


717 3 0 4 


V— 4 1 


718 3 0 5 


V— 4 7 


1 Q CT O A C Q 


V— 4 8 


/ 1 if 3 U b 


V— 4 9 


13 6 94 64 


V— 5 5 


-7-7-4 9 9 


V— 5 7 


-7 q n n t 
/ Z U o U / 


V— 5 8 


1 o / U 4 b o 


V— 6 1 


"7 O 1 o n o 

/ 2 1 J U o 


V— 64 


n o o q n Q 




7 2 3 


V — b o 


1 /l /l O /I O /I 


V— 7 1 


T^p 4 n~ 


V— 7 4 





V— 7 5 


13 7 2 4 6 7 


V — o U 


7 2 63 1 1 


V— 8 1 


t iL t O 1 Z 


V— 8 7 


/ZOO 1 O 


V— 9 0 


1 Q T /I /I iC O 


V-i — 0-2- 




V-i Qt3- 


3 41 




-3-4-2- 







V-i — 0-7- 


3 4 4 


V-i — 0-8- 




V- E 


-3-4-6- 


V-i Hr 


3 4 7 


VI — 1 2 


8 6 93 4 1 


VI — 1 3 


o t n •"' /i o 
o / U O 4 Z 


VI — 1 4 


Q71 Q A O 
O / 1 O 4 O 


VI — 1 6 


o n Q q /i /i 
o / o o 4 4 




3 >j Q 


V-i — i-9- 


3 4 9 


V-i — 2-0- 




V-i — 2-i- 









VI — 2 3 


O "7 O O /I "7 

o / o o 4 / 


VI — 2 4 


OT QO /I O 

o / 9 o 4 o 







V-i — 2-6- 




V-i — 2— 7- 





V-i — 3-t- 


^T^, c 


VI — 3 2 


•j^ -351 




3 5 7 


V-i — 3^5- 


tt^ 352 


VI — 3 9 




VI — 4 3 


1 Q Q O /I "7 1 


VI — 4 4 


119 3 4 U 9 


VI — 4 5 


o o n o c o 


V-i — 4-8 




VI — 4 9 


o iiz o U 1 


vi-ju 


Q O Q Q c: C 

oijjj o b 


VI-53 


8-9^3 5 7 


VI-55 


^^3 5 9 


VI-58 




VI-66 




VI-67 


■9^3 64 


VI-70 


4-£-&2 


VI-71 


1387472 


VI-74 


#0^3 65 


VI-75 


-9^3 6 6 


VI-76 


-9^3 6 7 
VI — 77 


i i n o 
1 1 U J 


VI — 79 




VI — 8 0 


o n Q O (Z Q 


VI — 8 5 


o i n o c. Q 
'J 1 U J b y 


VI — 8 7 


jIIj / U 


VI — 8 8 


'J 1 A J / 1 


VI — 9 0 


1 O (1 Pi /I 1 A 


VI — 9 3 


1 O (11 A n C 

1 J 9 1 4 / o 


VI — 9 5 


9153 7 4 


VI — 9 6 


1 O (1 O /I T iC 


VI I — 9-2- 


5 4 7 


VI I — 9-3- 




VI I — 9-4- 




5 4 9 







VI I — Q-& 


5 51 


VI I — Q-T- 





VI I — 9-8- 


|^| 


VI I — 9-9- 




VI I — i-9- 


^-5-5- 




"^"^ 


VI I irQr 


—5^5 


VI I 3r4- 


5 5 8 


VI I i-r)- 


5 5 9 


VI 1 — 17 




VI I — 1 8 


^""T 1 


VI 1 — 19 


5 621 7 1 


VI 1 — 20 


5 631 7 2 


VI 1 — 21 


5 6 41 7 3 


VI 1 — 2 2 


5 651 7 4 


VI 1 — 23 


5 6 61 7 5 


VI 1 — 24 


5 6 71 7 6 


VI 1 — 2 5 


1 Q O T A O Pi 

13 9/4 o U 


VI 1 — 2 6 


"^^" 5 


VI 1 — 27 


5 6 81 7 7 


VI I — 2-8- 




VI 1 — 2 9 


5 / U 1 / o 


VI 1 — 3 2 


5 711 7 9 


VI 1 — 33 


c t o i on 
3 / it 1 o U 


VI I — 3-4" 


5 7 3 


VI 1 — 3 5 


5 7 41 8 1 


VI 1 — 36 


5 7 51 8 2 


VI 1 — 3 9 


5 7 61 8 3 


VI 1 — 40 


5 7 71 8 4 


VI 1 — 41 


5 7 81 8 5 


VI 1 — 42 


"^~^ 186 


VI 1 — 43 


^ n 


VI 1 — 44 


5 811 8 8 


VI 1 — 4 5 


c Q q i q O 


VI 1—4 6 


jo Jl y U 


VI 1—4 7 


i o n n /i i c 
1 Z U U 4 1 O 


VI I — 4-8- 




VI 1 — 4 9 


5 8 51 9 1 


VI 1 — 5 0 


5 8 61 92 


VI 1 — 5 2 


5 8 71 93 


VI 1 — 53 


C Q Q 1 fi /I 

oooi y 4 


VI 1 — 54 


5 8 91 9 5 


VI 1 — 55 


c n n i r> c 
5 9 U I y b 


VI 1 — 57 


5 911 9 7 


VI 1 — 5 8 


c c\ o i no 
5 9 21 9 8 


VI 1 — 5 9 


c D O 1 DO 

5 9 j i y y 


VI 1—62 


c n a o n n 
5 9 4 2 U U 


VI 1—63 


5 95 2 0 1 


V _L _L — O 4 


S a g 9 n o 

o y b z u z 


VII-65 


W^2 0 3 


VII-66 


W&2 0 4 


VII-67 


13 99481 


VII-71 




VII-72 




VII-73 


■^0^2 0 7 


VII-74 


^03-2 0 8 


VII-76 


^0^209 


VI I 7 7 


(C Q A L 


VII-80 


£0^210 
VI 1 — 81 


6 0 62 1 1 


VI 1 — 82 


en to i o 
b U / Z 1 Z 






VI 1 — 84 


r n o o 1 Q 


VI 1 — 86 


n a a q a o i 
1 4 5 3 4 o / 


VI 1 — 87 


610 2 1 4 


VI 1 — 89 


6 112 1 5 


VI 1 — 90 


612 2 1 6 


VI 1 — 91 


6132 1 7 


VI 1 — 92 


614 2 1 8 


VI 1 — 93 


6 15 2 1 9 


VI I — 9-4- 




VI 1 — 96 


b 1 / 2 z U 


VI 11 — 09 


618 2 2 1 


VI 11 — 10 


c i no o o 
bl ijz z z 


VI I I 3r4r 


g o Q 


VI 11 — 12 


b 2 12 2 o 


VI 11—13 


,c 2 2 2 2 4 


VI I I 




VI 11 — 16 


62 4 2 2 5 


VI 11 — 17 


62 5 2 2 6 


VI 11 — 18 


r o r o o T 
bz bz z / 


VI 11 — 19 


C O "7 O O O 

bz / 2 2 8 


VI 11 — 20 


"^^ 229 


VI 11 — 21 


-6^2-9-2 oO 


VI 1 1 — 2-2- 


14 5 5 


VI 11—23 


63 0 2 3 1 


VI 11—24 


b J 1 Z J Z 


VI 1 1 — 25 


b J z 2 J J 


VI 11—26 


1 /l c: C A Q O 


VI 11—2 7 


63 3 2 3 4 


VI 11—28 


b J 4 2 J 5 


VI 11—29 


63 5 2 3 6 


VI 11 — 30 


f z, t -~ o 2 "7 


VI 11—31 


b J / Z J o 


VI 11—32 


b J o 2 j y 


VI I 1—33 


b J 9 z 4 U 







VI 11—36 


6 412 4 1 


VI 11 — 37 


b 4 Z 2 4 2 


VI 11 — 38 


b 4 J 2 4 J 


VI 11 — 40 


b'l'iZ 4 4 


VI 11 — 41 


6 4 5 2 4 5 


VI 11 — 42 


6 4 62 4 6 


VI 11—43 





VI 11 — 45 




VI 11—46 


b '1 it Z 4 y 


VI 11-47 


65 0 2 5 0 


VI 11 — 48 


6512 5 1 


VI 11 — 50 


65 2 2 5 2 


VI 11 — 51 


^^" 253 


VI 11 — 53 


65 4 2 5 4 


VI 1 1 — 54 


^^ 255 


VI 11 — 55 


65 62 5 6 


VI 11—56 


65 7 2 5 7 


VI 11 — 57 


65 9 2 5 8 


VI 1 1 — 5-8- 




VI 11 — 59 


r r n o ^ q 
b b u z o y 


VI 11 — 60 


6 612 6 0 


VI I I 64r 


"k^ 3 " 


VI 11—64 


6 63 2 6 1 


VI 1 1 — 6-§- 




UTTT-KK 
Vlll D O 


ic g c 9 (C o 
b b -J z b z 


VIII-67 


^^2 63 


VIII 6 8 




VI 1 1 6 9 




VI 1 1 -7 0 




VIII-71 


^W2 6 5 


VIII-72 


^4-2 6 6 


VIII-73 


^7-^-2 6 7 


VIII-74 


^S-2 6 8 


VIII-75 


^^2 6 9 


VIII-76 


^^2 7 0 
VI 11 — 77 


6 7 6 2 7 1 


VI 1 1 — 48- 


6 7 7 




g "7 Q 


VI 11 — 80 


b / 3 2 / 2 


X— 0 7 


O Pi Q O O Q 

o U o o 2 o 


X— 1 5 


on /i o o Q 


X - 2 ( j 


on t o o n 
ol / o o U 


X— 2 9 


Q O 1 QQ1 

o 2 1 3 J 1 


X— 3 4 


S 2 5 3 3 2 


X — 4 6 


S 3 3 3 3 3 


X— 5 4 


O O "7 O O /l 

o 3 / o o 4 


X— 5 6 


S 3 9 3 3 5 


X— 6 8 


i o n T A O 1 

1 2 U / 4 2 I 


X— 7 2 


8 4 9 3 3 6 


X— 7 3 


i o n o /i o o 
1 Z U o 4 Z Z 


X— 94 


5 6 0 3 ^ 7 


XI — 1 3 


1 2 U J 4 Z o 


XI — 3 7 


1 A (Z Pi /l O Pi 

1 4 b U 4 y U 


XI — 4 3 


i o i n/io/i 


XI — 6 7 


1 O 1 1 /IOC 


XI — 8 1 


12 12 4 2 b 


XI I — 0 7 


1 Z 1 3 4 Z / 


XI I — 3 5 


1 O 1 /l /l o o 


XI 1 — 3 6 


1215 4 2 9 


XI I — 5 9 


12164 3 0 


XI 1 — 65 


i n o o o o i 
1 U d e J o 1 


XI I — 92 


1217 4 3 1 


XI 1 1 — 0 3 


D 1 "7 O "7 C 

y i / 3 / o 


XI 1 1 — 0 4 


1 O 1 o /i o o 
1 Z 1 o 4 J Z 


XI I I — 1 9 


1 Ol Oi A Q Q 


XI 1 1 — 2 4 


O O iC O "7 C 

y Z b J / b 


XI 11—51 


O Q O O "7 "7 

y 3 o 3 / / 


XI 11—52 


QQQQ7Q 

y 3 y 3 / o 


XI 1 1 — 6 7 


O A n O "7 o 

y 4 / 3 / y 


XI 1 1 — 6 9 


o /i q o o n 

y q y 3 o u 


XI 1 1 — 8 8 


i o o n /i q /i 
1 2 2 U 4 3 4 


XI 11 — 92 


1 O O 1 /IOC 

12 2 14 3 5 


XV — 2 2 


i u y y 3 o o 






XV— 2 5 


i o o /i /i o r~ 
12 2 4 4 3 b 


XV— 4 2 


110 8 


XV— 6 2 


1 O O f~ /l O "7 

1 Z Z b4 3 / 


XV— 6 4 


1 n i o o n o 
1 1 1 8 J 9 0 


XV— 8 4 


i i o c o r> i 

11233 y 1 


XV 1—19 


X^ 3 ^ 


XVI — 3 6 


10 5 63 8 2 


XVI — 5 3 


12 3 0 4 5 9 


XV 1—60 


i n t i o o o 


XVI- 6 6 


1074384 


XVI-74 


10 813 8 5 


XVI-7 6 


1083386 


XVI- 7 7 


1084387 


XVII-31 


4-44^3 92 


XVI 1-4 0 


1231440 


XVII-48 


1148393 


XVI 1-7 6 


1160394 


XVI I -8 7 


1165395 


XVI I -95 


1172396 



Sequences not available for sequence IDs in Table 1, and corresponding sequence IDs 
in Table 2 and 4. 

2 gg , 3 01, 3 0 5, 3 0 7, 312 , 317, 3 1 S , 319, 320, 332 , 333 , 334, 33 6, 340, 341, 342, 343, 

344, — 3445-; — 3-4^7 — 3-4-44 — 3-4-8-; — 3-4-9-; — 34>44 — 3434^ — 3442-; — 3443-y — 3444-; — 3445-; — 3444; — 3447-; — 444-; — 3449-; — 367, 
313r, — 3444; — 3444; — 3444 — 3-^r — 3-8^7 — 3-944; — 3-9434 — 44-4-; — 44344 — 443-9-; — 44-44 — 4-443-; — 444-; — 4-44-; — 4444; — 444-r 
4444 — — ^r&- f — 4444 — — ^7 — 4444; — 498, 501, 511, 514, 516, 517, 520, 522, 528, 531, 
535. 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 573, 584, — 444-r 
608, 616, 620, 623, 640, 659, 662, 664, 667, 668, 673, 677, 678, 679, 681, 695, 702, 
3-±3r-, — i-y^-, — — &£-&r — §-9^7 — m^r- f — m$- t — — 1101, — nos, — 1109, — 1177, — 1197, — 1193, — 1204, 

1220, — 1239, — 1255, — 1256, — 1342, — 1347, — 1354, — 1357, — 1362, — 1363, — 1364, — 1373, — 1375, — 1379, 
Table 2a 

List of informative probes for diagnosis of breast cancer 



Clone 
ID 



1-24 



1-28 



1-30 



4— ^5- 



IV- 5 3 



IV 62 



IV- 6 9 



IV- 8 0 



4-V- 



IX-10 



ix u 



IX-38 



IX 3 9 



IX 4 2 



IX-77 



V-11 



V 3 2 



V 39 



V-55 



V-80 



V 9 4 



VI-07 



VI 41 



VI-48 



VI 4 9 



¥4- 



VI 6 5 



VI-70 



SEQ ID NO. in Sequence Listing 



5-0-8-11 



^4-^13 



4-4-££398 



1181 399 



44-498 



4-4^-4 



44^4-2 91 



4-#£ 



454>314 



^317 



n ^ Q 



^319 



1188 404 



4*-7-4 9^ 



444-43 11 



4494-3 5 5 



-8-9-3-3 5 9 



^©-8-2 



Clone ID 



VJ 



VIII-48 



VI 11-66 



VIII 7 4 



VIII-76 



-44 



X-07 



X-15 



X-29 



X-54 



X-5 6 



X-68 



X-72 



X-94 



XI 0 7 



XI-13 



XI 5 0 



XI-81 



XII-07 



XII 17 



XII 2 6 



XII 2 7 



XII 3 2 



XII-35 



SEQ ID NO. in Sequence Listing 



44*4-2 51 



4^42 62 



4+4>2 7 0 



4444328 



44443 2 9 



4444-331 



4444^335 



1207 421 



4449-336 



44443 3 7 



^44444423 



1213 427 



1214 428 
Clone ID 


Sequence ID 


XII — 3-6- 




XII — 




XI I — 5 9 


i on c a o n 
1 iL 1 b 4 J U 


XI 1 1 — 1 9 


1 A 1 : j 4 J j 


XIII — 2-9- 




XI 1 1 — 5 2 


9 J 9 J / 0 






XIII — 8-4- 




XI 1 1 — 9 2 


1 O O 1 /l 0 c 


X-V- — i-8- 




XV— 2 2 


1 n n n o 0 0 
1 0 9 9 0 0 0 


X-V- — 2-4- 




XV— 2 5 


1 T O /I /I O C 










XV- 4 2 








X-V- — t*-4- 




X-V- — 9-3- 




XV— 9 ''I 




X-V- — 9-6- 




XVI — 3 6 


10 5 63 8 2 


XVI — 5 3 








XVI — 66 


I U / (Id 0 4 


XVI - 7 6 


1 n 0 0 q 0 c 
1 U 0 3 3 0 6 


XVI — 77 


1 n 0 a 0 0 "7 


av 1 1 — y-A 












X-V I 1 T^rPr 




vi rT t o o, 
A V 1 1 — zi-y- 




XVI 1—31 


1 1 0 a 0 a n 1 
113 93 9 2 






XVII 3 9 




XVI 1-4 0 


1231440 


XVII-48 


1148393 


XVII 5 5 




XVII 5 8 




XVII 6 7 




XVII 7 2 




XVI 1-7 6 


11603 94 | 


XVII 8 2 




XVI 1-8 7 


1165395 


XVII-95 


1172396 | 
List of sequences of probes informative for breast cancer 

Please see the note at the bottom of Table 1. Some sequences are missing. 



Clone ID 


SEQ ID NO. in Sequence Listing 


1-13 


1331444 


1-14 


1178397 


1-24 




1-25 


30-9-12 


1-28 


^4-^13 


1-3 0 


1180398 


1-37 


1440482 


1-42 


1332445 


1-48 


^4-19 


1-54 


1181399 


1-60 




1-72 


1335446 


1-81 




1-82 


^4^3 2 


1-86 


1336447 


1-88 


1182400 


1-95 


1337448 


11-02 


^££3 3 


11-03 


^£4-3 4 


11-06 


3-^43 6 


11-07 




11-10 


^£&3 9 


11-21 


^7-8-4 5 


11-23 




11-24 


^£4-4 7 


11-25 


^3-4 8 


11-27 


^3-45 0 


11-33 




11-34 


^4-5 6 


11-41 


^7-60 


11-42 


^££61 


11-46 


404^64 


11-47 


4-^£449 


11-48 


4^-66 
11-52 


44468 


11-57 


44-4-73 


11-58 


44-2-74 




'llJ / O 


11-60 


44-47 6 


11-61 


4447 7 


11-62 


44-47 8 


11-64 


44480 


11-67 


42-4-8 3 


11-69 


42-485 


11-70 


4448 6 


11-74 


444 j)0 


11-80 


44496 


11-82 


44498 


11-84 


44499 


11-87 


444-100 


11-88 


442-101 


11-96 


444105 


111-01 


444106 


111-02 


44410 7 


111-06 


44410 9 


111-08 


444111 


111-12 


444114 


111-13 


4-44115 


111-17 


1344450 


111-18 


444116 


111-20 


1183401 


111-21 


444117 


111-23 


444119 


111-24 


444120 


111-25 


444121 


111-26 


444122 


111-27 


444123 


I I 1-2 8 


444124 


111-29 


444125 


111-32 


444127 


111-33 


444128 


111-35 


444130 


111-39 


444131 


111-40 


44413 2 


111-42 


444133 


111-45 


444135 


111-46 


444136 


111-47 


444-137 


111-48 


444138 


111-56 


444144 


111-57 


444145 
111-58 


^-©-§-14 6 


111-59 


^££147 


111-61 


^£■^148 


I I I- 62 


^^■149 


111-63 


^4150 


111-64 


^9-151 


111-66 


^4-2-152 


1 1 1 - 6 7 


54^153 


111-70 


^4-^154 


111-74 


^4-&155 


111-75 


^4-9-156 


111-78 


^g-4-157 


111-80 


^-2-^158 


111-81 


^■2-4159 


111-82 


1348451 


111-85 


^^161 


111-86 


^g-^162 


111-88 


^2-9-163 + 164 


111-89 


^^165 


111-92 


13 514 5 2 


111-93 


^2-16 6 


111-95 


^3-4168 


111-96 


1352452 


IV- 04 


£^2-273 


IV- 13 


£#^■274 


IV- 14 


££-4275 


IV- 15 


118 5 4 0 2 


IV- 17 


■S&5-2 7 6 


IV- 2 3 


1353454 


IV- 2 6 


1186403 


IV-31 


££■42 7 8 


IV- 3 2 


£££2 7 9 


IV- 3 5 


1355455 


IV- 3 7 


G£4 9 7 


IV-38 


££^2 8 0 


IV- 4 2 


£-942 8 2 


IV- 4 3 


1239441 


IV- 4 7 


£-9£284 


IV- 5 3 


£4498 


IV- 61 


£^£2 8 6 


IV- 6 4 


£^42 8 7 


IV- 6 9 


4-94M 


IV- 7 2 


£»92 8 9 


IV- 8 0 


4£4-2 91 


IV 8 2 


4t££ 


IV- 8 5 


4£42 92 


IV- 93 


1360457 


IV- 9 6 


4£42 95 
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IX-10 


44^314 


IX 12 


n 2 o 


IX-13 


44i4»315 


IX-24 


4^316 


IX-38 


4444^317 


IX-39 


4^448-3 18 


IX-48 


^4319 


IX- 5 0 


4^-4443 2 0 


IX-56 


^321 


IX-62 


4^4^322 


IX-65 


4^3 2 3 


IX- 7 2 


^324 


IX-77 


^3 2 5 


IX-91 


4^326 


IX- 9 6 


44444-327 


V-01 


1361458 


V-03 


444^296 


V-0 4 


4U44Z-2 9 7 


V-0 7 


W^-2 9 8 


V-0 8 


4*4442 9 9 


V-ll 


1188404 


V-12 


^4-4-3 01 


V-17 


1364459 


V-24 


4^4-43 0 3 


V-25 


4^4>460 


V-28 


1189405 


V-38 


1366461 


V-3 8 


1190406 


V-39 


4-4-4443 8 9 


V-41 


44^443 0 5 


V-47 


4^4444 63 


V-4 9 


1369464 


V-55 


4^-4-4 9 9 


V-57 


4*4443 0 7 


V-58 


1370465 


V-61 


4*444-3 0 8 


V-64 


442-2-3 0 9 


V-65 


1371466 


V-6 8 


1448484 


V-71 


1495496 


V-7 4 


4*444310 


V-7 5 


1372467 


V-8 0 


4*44£311 


V-9 0 


1374468 


VI-03 


8 6 4 3 3 8 


VI- 0 4 




VI-07 


4441 


VI-08 


44£4*-340 
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VI-09 


1378469 


VI-12 


&£#341 


VI-13 


£W342 


VI-14 


^4-3 4 3 


VI-16 


£^»3 4 4 


VI-19 


■&#3 4 5 


VI-20 


£■^■£346 


VI-21 


1380470 


VI-23 


^347 


VI-24 


&^#3 4 8 


VI-25 


1192408 


VI-26 


££4-3 4 9 


VI-32 


^351 


VI-39 


^^■3 5 2 


VI-43 


1382471 


VI-44 


1193409 


VI-45 


££#353 


VI-48 


£#4-3 5 5 


VI -4 9 


£##5 01 


VI-50 


£#£3 5 6 


VI-53 


£#^3 5 7 


VI-55 


£##3 5 9 


VI-58 




VI-66 


##£363 


VI-67 


##4-364 


VI-70 


i#£2 


VI-71 


1387472 


VI-74 


###365 


VI-75 


###3 6 6 


VI-76 


###367 


VI-77 


4-4-#3 


VI-79 


1389473 


VI-80 


#0*368 


VI-85 


###3 6 9 


VI-87 


#4-4-320 


VI-88 


#4-5-3 71 


VI-90 


1390474 


VI-93 


1391475 


VI-95 


#4-#3 7 4 


VI-96 


1392476 


VII-02 


1195410 


VII-03 


1196411 


VII-06 


1394477 


VII-08 


1197412 


VII-09 


1198413 


VII-10 


1395478 


VII-11 


1396479 


VII-15 


1199414 
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VII-17 


^0169 


VII-19 


5-S2171 


VII-21 


^4173 


VII-22 


^^174 


VII-23 


^££175 


VII-24 


^£■3-17 6 


VII-25 


1397480 


VI 1-2 6 




VII-27 


^8-177 


VII-29 


^W17 8 


VII-32 


^4-17 9 


VI 1-3 3 


^180 


VII-36 


^182 


VII-39 


^■£183 


VII-41 




VII-42 


^4186 


VII-43 


^■8^18 7 


VII-46 


^^190 


VII-47 


12 0 0 415 


VII-48 


12 01416 


VII-49 


£#£■191 


VII-54 


^■8-9-195 


VII-57 


^4-197 


VI 1-5 8 


^^198 


VII-59 


^g-199 


VII-62 


^■£42 0 0 


VII-63 


1202417 


VII-64 


^#£202 


VII-66 


^9^-2 0 4 


VII-67 


1399481 


VII-72 


^206 


VII-73 


££4-2 0 7 


VII-77 


1203418 


VII-80 


£££210 


VII-82 


£££212 


VII-86 


14 5 3 4 8 7 


VII-87 


£££214 


VII-90 


£££216 


VII-91 


£j-£217 


VII-92 


££4218 


VII-93 


£££219 


VII-96 


£££220 


VIII-09 


£^£2 21 


VIII-10 


£££2 2 2 


VIII-13 


£££224 


VIII-16 




VIII-20 


£££2 2 9 


VIII-21 


£££230 
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VIII 2 2 


14 5 5 


VIII-23 


£££231 


VIII-24 


£^232 


VIII-25 


£££233 


VIII-26 


1456489 


VIII-27 


£^■2 3 4 


VIII-28 


£342 3 5 


VI 1 1 -2 9 


£££236 


VIII-30 


£££237 


VIII-31 


£^■238 


VIII-32 


£^£■239 


VI 1 1-3 3 


£££2 4 0 


VIII-34 


1201419 


VIII-38 


£43243 


VIII-40 


£442 4 4 


VIII-41 


£4£2 4 5 


VIII-46 


£4£249 


VIII-48 


£34^2 51 


VIII-55 


£££2 5 6 


VIII-57 


£££2 5 8 


VIII-59 


£££^259 


VIII-60 


££4^2 6 0 


VIII-61 


1205420 


VIII-64 


£££261 


VIII-66 


£££■262 


VIII-73 


££3267 


VIII-74 


£££268 


VIII-76 


£££2 7 0 


VIII-80 


£££■272 


X-07 


£££328 


X-15 


££4329 


X-20 


£££3 3 0 


X-29 


££4,3 31 


X-34 


£££332 


X-46 


£££333 


X-54 


£££3 3 4 


X-56 


£££335 


X-68 


1207421 


X-72 


£4£336 


X-73 


1208422 


X-94 


£££3 3 7 


XI-13 


££££423 


XI-37 


1460490 


XI-43 


1210424 


XI-67 


1211425 


XI-81 


1212426 


XI 1-0 7 


1213427 


XII-35 


1214428 
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XII-36 


1215 4 2 9 


XII-59 


±-£^£430 


XII-65 


^^8-3 81 


XII-92 


1217431 


XIII-03 


#W375 


XIII-04 


1218432 


XIII-19 


1219433 


XIII-24 


m376 


XIII-51 


#^£3 7 7 


XIII-52 


■9^3 7 8 


XIII-67 


■»4-£3 7 9 


XIII- 6 9 


^4-^3 8 0 


XIII-88 


1220434 


XIII-92 


1221435 


XV- 2 2 


1099388 


XV 2 4 


1101 


XV- 2 5 


12224436 


XV 4 2 




XV- 6 2 


1226437 


XV- 6 4 


1118 3 9 0 


XV- 8 4 


1125391 


XVI- 19 


1228438 


XVI-3 6 


1056382 


XVI-53 


^^439 


XVI- 60 


^^383 


XVI- 6 6 


1074384 


XVI-74 


1081385 


XVI-76 




XVI-77 


1084387 


XVI 1-31 


^^9-3 92 


XVI 1-4 0 


12 314 4 0 


XVII-48 


114 83 93 


XVI 1-7 6 


1160394 


XVI I -8 7 


1165395 


XVI I- 95 


1172396 
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Table 3 

List of informative probes (Clone ID) selected for breast cancer diagnosis based on 
their occurrence criterion during variable selection 



Occurrence* 


Clone ID 


100% 


XI 8, XVI-66, VIII-66, XVI 5 9, VII-03 , XIII-1 9, XII-35 , X 35, XI 50, XII 

IV-5 3,XIII 29, XIII 62,1-30, 1 1 1 - 0 6 , XV- 2 2 , XV 9 4 , VI 1-15, VI I - 
3 9, IX-3 9,XVII 3 9, I I 1-4 0, VI 1-3 2 


90% 


I 5 2, VI 65, VI 3 4, IV 62, XV 34, XVII 5 8, V- 11, VI 7 8, XII 3 6, XI II- 
92, VI I I -2 9, XVI -5 3, XVI -7 7, XI -13, XI I I 84, IV- 14, XII 31,V-8 0, VII- 
48, XVII 2 9, XVII 7 2 


80% 


III 60, VIII 74, IX 12, X 04, XIII-52, VIII-30, IX-3 8 


70% 


VI 49, X-29, VIII-48 


60% 


IV 82,IX-10,VI 52, X-68, VII-77 


50% 


IV- 15 


40%


XV 28, 11-7 0, V-55 






20% 


XI 58, XVI-36, VIII 39, VIII 44,111-61, IV- 6 9, XV 6 8,X-72 


10% 


IX 4 2, IX- 7 7, X- 94, XV 9 6 , XVI I 5 5 


5% 


XI I -5 9 , XVI - 7 6 , I -5 4 , XV 18,V 9 4, X- 54, VI -07, VI 1-47, XVI 1-31, XVI I - 
87, XVII-48 


In at least one model 


11-41, VI 41,111-57, I I 1-8 9, VI 1-7 3, XV-2 5, IV-2 6,X 3 4 , IV 41, VII- 

9 0, XV 4 2, XVII 8 2, XII 2 7, VI I I -2 0, 1-28, VII 60,VIII-76, I I 1-2 0, 

8 4, XI 0 7, XVII 2 8, XII 17, XVII 3 6, XII 5 2, XVI 1-76, VI II -4 6, VI -70, ^V— 

7 4, XV 93, VI I I -31, II-87,V 3 9 , VI -5 5 , X- 0 7 , X- 1 5 , XI I - 0 7 , XVI I 07, XVII 

■0-&7-XVII-95, 1-24, IV-32,V 32,VI-48,VI 72,IV-80, IX-4 8 , X-5 6 , 24V— 

2 4, XII 3 2, XVI I -4 0 



*100% = Genes appearing in all the 75 cross validated models; 90% = Additional genes 
appearing in at least 68 out of 75 cross validated models; 5% = Additional genes 
appearing in at least 4 out of 75 cross validated models and so on. 
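
The occurrence levels in the footnote above can be illustrated with a short sketch. It assumes that each cross-validated model is available as a set of selected clone IDs; the function names `min_models` and `probes_by_occurrence` and the input layout are hypothetical and are not taken from the original analysis. A probe is assigned to the strictest level it reaches, which matches the "additional genes" wording of the footnote.

```python
from collections import Counter


def min_models(percent: int, n_models: int) -> int:
    """Smallest number of models a probe must appear in to reach `percent`."""
    # Integer ceiling division avoids floating-point rounding surprises,
    # e.g. 90% of 75 models -> 68 and 5% of 75 models -> 4.
    return -(-percent * n_models // 100)


def probes_by_occurrence(models, levels=(100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5)):
    """Group clone IDs by the strictest occurrence level they reach.

    `models` is an iterable of collections, one per cross-validated model,
    each holding the clone IDs selected in that model (assumed layout).
    `levels` must be ordered from strictest to loosest.
    """
    models = [set(m) for m in models]
    counts = Counter(clone for m in models for clone in m)
    n_models = len(models)
    groups = {level: [] for level in levels}
    groups["at least one model"] = []
    for clone, count in sorted(counts.items()):
        for level in levels:
            if count >= min_models(level, n_models):
                groups[level].append(clone)
                break
        else:
            # Selected somewhere, but below the loosest percentage level.
            groups["at least one model"].append(clone)
    return groups
```

With 75 models, a probe selected in 68 of them lands in the 90% group, and one selected in 4 of them lands in the 5% group.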
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Table 4a 

List of informative probes for diagnosis of Alzheimer disease 



Clone ID    SEQ ID NO. in Sequence Listing    Clone ID    SEQ ID NO. in Sequence Listing




— 




III 60 


- 


I 02 


- 




111-63 


444315 0 


I 13 






III 68 






- 






- 


I 21 


- 




111-74 


443-155 


1-34 


34-3-15 




111-80 


4-441158 


I 37 


- 




III 82 


- 


4-43- 


- 




111-85 


43-4161 


1-58 


33-42 4 




III 92 


- 


I 71 


- 




III 96 


- 


I 72 


- 




IV 2 3 


- 


I 96 


— 




IV 2 6 


— 


I 95 


— 




IV 2 9 




11-03 


3 613 4 




IV- 3 1 


34442 7 8 


il U J 


3 g 3 ? 5 








11-06 


3 6 4 3 6 




I T7 Q 5 




11-10 


44-8-3 9 




IV 4 5 




11-24 


3-§4r4 7 




IV- 8 0 


44342 91 


11-25 


44434 8 




IV 8 2 


_ 


11-26 


43^4 9 




IV 93 


- 


11-33 


4^3-5 5 




TI Q ^ 


- 


11-34 


4^5 6 






- 


11-42 


3~£«-61 




V-0 3 


4442_96 


II 4 7 


— 




V-04 


44342 9 7 j 


11-57 


4^4-7 3 




y n c 


— 


11-61 


4445-7 7 




V-0 7 


44341298 


11-69 


4-2-3-8 5 




V-12 ; 


4443 01 


11-75 


434191 




V 15 


— 


II 8 3 


— 




V 17 


— 


11-84 


4343-99 




V-2 1 ; 


— 


11-88 


44*101 




V 25 


— 


II 90 


— 




V 35 


— 


11-94 


4441104 




V 42 


— 


111-02 


4443-10 7 




v 43 : 


— 


III 0 5 


— 




V 47 


— 


111-06 


4441-1 0 9 




V 49 


— 


111-08 


4443-111 




V-5 2 


— 


III 10 


— 




V 54 


— 


111-13 


4434115 




V 58 




III 15 






V 59 




III 17 






y ^ 




111-23 


4443-119 




y f O 




111-26 


444122 




V 71 




111-35 


4*5-1 3 0 




V 75 




111-39 


43^131 




V 79 




111-43 


4-^435 0 0 




V-80 


43-4311 


111-44 


4-^4-13 4 




V 90 




111-53 


4^4314 2 




V 91 




111-56 


4^3-14 4 




V 9 2 





Clone ID    SEQ ID NO. in Sequence Listing


VI 0 2 





VI-04 


865339 


VI 0 9 





VI 10 


_ 


VI-12 


3^3 41 


VI-14 


£-^3 4 3 


VI 17 


_ 


VI-20 


£■^346 


VI 21 




VI-23 


^§■3 4 7 


VI 41 





VI 4 2 


_ 


VI 4 3 





VI 4 4 





VI-48 


£-94^3 5 5 


VI 4 9 


- 


VI-5 0 


^#^3 5 6 


VI-53 


^^3 5 7 


VI 71 


— 


VI-74 


-9^-3 65 


VI-7 6 


90 73 6 7 


T T J TO 




T J J TO 




V 1 — o / 




VI — 8 8 


-9-1-2-3 7 1 






t 7 j rj o 




V 1 — o 


J ± 3 J 1 H. 














T 7 1 1 0 6 








V 1 1 — trt- 




VI I — 1 9 


J OjI / 1 


VI I — 2 1 


■§-6-4-1 7 3 


T 7 I I ° 5 


QO 


VI I — 3 6 


^"-TTT- 


VI 1 — 42 




VI 1—43 


OoUl o / 


VI 1 — 46 


c o o i q n 
joji y u 


VI I — 5 9 


5 931 9 g 


VI I — 6 3 


o u o z u _L 


VI I — 6 6 


c. o q o pi /i 

O 'J O Z U 4i 


T T J J 






cnno pi £ 

PUUZ U O 


VI I — 7 3 


■c pi I o n 1 
OU1Z u / 


T TT T T C; 




VI o ° 




VI 0 4 




VI 0 9 




VI 10 




VI-12 


■8-^-3-3 4 4 


VI-14 


3-^+3 4 5 


VI 17 
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Clone ID 


SEQ ID NO. in Sequence Listing


VII-91 


613217 


VII-93 


^i^-219 


VIII 01 


_ 


VIII 0 2 


_ 


VIII 0 3 





VIII 0 6 





VIII-09 


^^221 


VIII 10 





VIII 15 


_ 


VIII 2 2 





VIII 2 6 


_ 


VIII-28 


t> 3 4 2 3 5 


VIII-30 


£ 3g237 


VIII-32 


^§-239 


VIII-33 


^^240 


VIII-41 


^4^2 4 5 


VIII-42 


£4^246 


VIII-48 


^4^2 51 


VIII 5 8 


— 


VI I I- 64 


6 632 6 1 


VI 1 1 — 




VI 1 1 — 6 7 


o o o z, o o 


T 7 J. 1 1 n ° 




T 7 ~L 1 1 00 




T 7 J. 1 1 ° ^ 




A 7 1 1 1 ° 5 




T 7 1 1 1 ° ^ 




T 7 1 1 1 0 1 




T 7 1 1 1 9 ° 




T 7 1 1 1 ° Q 




A 7 1 1 1 a 5 
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Table 4b 

List of sequences of probes informative for Alzheimer disease 
Please see note to Table 1



Clone ID    SEQ ID NO. in Sequence Listing






1-10 


2-9-9-6 


1-15 




I 16 


3 Q ]_ 


1-17 


^2-8_ 


1-19 




4— 




1-22 


W£10 


I 23 




1-24 




1-25 


^-912 


1-28 




1-31 


^4t4t14 




312 


1-34 


^4^15 


1-38 


31^1 6 


1-39 


^4-^17 


1-40 


^4^_1_8 


I 44 




I 45 


3 2_ q 


I 46 


^4-# 


I 47 




1-4 8 


3 2ii 9 


1-49 


^2-2 0 


1-53 


^^■21_ 


1-5 6 


^42 2 


1-57 


^2 3 


1-58 


^2 4 


1-60 


^2 5 


1-64 


3 2 5 2 6 


1-67 


33027 


1-69 


^4^2 8 


I 71 


3 3 2 


I 72 


333 




^ ^ /[ 


1-77 


^^.29 


I 7 9 




1-80 


3373 0 


1-81 


33S-31 


1-82 


^-9-32 


VI 0 2 


310 


VI 03 


^44- 


VI 0 4 


^43- 
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VI 0 6 


3 ^ 3 


VI 0 7 


444 


VI 0 9 




VI 0 9 


345 


VI II 


444 


VI 18 


349 


VI 1 9 


444 


VI 2 0 


35 q 


VI 21 


444 


VI 2 2 


3 5 2 


VI 2 5 


444 


VI 2 6 


444 


VI 2 7 


~iS 


VI 31 




VI 3 3 


35-7 


VI 3 5 


350 


V-i — 48- 


350 


11-02 


3£C33 


11-03 


4443 4 


11-05 


4443 5 


1 1 - 0 6 


3-443 6 


11-07 




11-08 


4443 8 


II 0 9 


3 g n 


1 1 - 1 0 


4443 9 


11-11 


44440 


11-12 


3-4441 


11-13 


4444 2 


II 14 


3 n 2 


11-15 


4443-4 3 


11-16 


4444 4 


II 17 


375 


II 18 


3 -7 5 


II 20 


444 


11-21 


4444 5 


II 22 


444 


11-23 


443-44 6 


11-24 


4444 7 


11-25 


4444 8 


11-26 


4444 9 


11-27 


4445 0 


II 28 


305 


11-29 


3g651 


11-30 


4445 2 


11-31 




11-32 


4445 4 


11-33 


44455 


11-34 


44456 


II 35 


444 


II 37 


303 


11-38 


44457 
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11-39 


5^45 8 


11-40 


5545 9 


11-41 


55560 


11-42 




11-43 


3-9-9-62 


11-44 


4M63 


II-4 6 


45464 


11-47 


455-65 


11-48 


453-66 


II 19 


454 


11-50 


45567 


11-52 


45468 


11-53 


45569 


11-54 


45570 


11-55 


45571 


II-5 6 


44-572 


11-57 


44-4-73 


11-58 


44-5-74 


11-5 9 


443-75 


11-60 


44-47 6 


11-61 


44577 


11-62 


4447 8 


11-63 


4457 9 


11-64 


44-8-8 0 


1 1 - 6 5 


44581 


11-66 


45582 


11-67 


454-83 


11-68 


45-3-84 


11-69 


453-85 


11-70 


4548 6 


11-71 


45587 


11-72 


45588 


11-73 


4558 9 


11-74 


45-8-90 


11-75 


454191 


11-7 6 


45592 


11-77 


45493 


11-78 


43-594 


11-7 9 


4543-95 


11-80 


45496 


11-81 


45497 


11-82 


45598 


II 83 


455 


11-84 


45599 


II 85 


455 


II 86 


445 


11-87 


444-10 0 


11-88 


445101 


II 89 


443 


II 90 


444 


II 91 


444 
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11-92 


444102 


11-93 


44410 3 


11-94 


44-8-104 


II 95 


444 


11-96 


444105 


111-01 


444106 


111-02 


44410 7 


111-03 


44410 8 


III 0 4 


444 


III 05 


444 


111-06 


44*10 9 


111-07 


444110 


111-08 


444111 


III-O 9 


444112 


III-ll 


444113 


111-12 


444114 


111-13 


444115 


III 14 


^ (Z ^ 


III 15 


444 


III 16 


444 


III 17 




III 18 


444 


III 19 


444 


111-21 


444117 


111-22 


444118 


111-23 


44411 9 


111-24 


444120 


111-25 


444121 


III-2 6 


444122 


111-27 


444123 


111-28 


444124 


III-2 9 


444125 


111-31 


44412 6 


111-32 


444127 


111-33 


444128 


111-34 


444129 


111-35 


44413 0 


III 37 


q o (z 


III-3 9 


444131 


1 1 1 - 4 0 


444132 


111-42 


44413 3 


111-43 


4445 0 0 


111-44 


444134 


111-45 


444135 


III-4 6 


444136 


111-47 


444137 


111-48 


444138 


I I 1-4 9 


444139 


111-50 


444140 


III 51 


444 


111-52 


444141 
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111-53 


^##142 


III 5 4 




111-55 


^-0-2-143 


III-5 6 


^■0^144 


1 1 1 - 5 7 


-§-0-4145 


111-58 


W^14 6 


111-59 


^-0-^147 


111-61 


,§-0-7-148 


111-62 


^-0-8-14 9 


111-63 


£04150 


111-64 


£4^9-151 


III 65 


^4-4 


111-66 


44^2-15 2 


111-67 


444^153 


III 69 


444 


111-70 


444154 


III 71 




— 3-3- 


444 


111-74 


44-8-155 


111-75 


^156 


III 77 


5 2 0 


111-78 


44415 7 


III 7 9 


444 


111-80 


444158 


1 1 1 - 8 1 


44-4159 


111-83 


6 0 


111-85 


4441 61 


III-8 6 


444152 


III 87 


5 2 8 


111-88 


434>163/164 


III-8 9 


^9-1 65 


III 91 




111-93 


4441 66 


111-94 


4441 6 7 


111-95 


^41 68 


4^4-4 — 




VII 0 2 


4-44 


VII 0 3 


5 Q 


VII 0 4 


4-44 


VII 0 5 


55 q 


VI I 0 6 




VII 0 7 





VII 0 8 





VII 0 9 




VII 10 




VII 11 


444 


VII 12 


444 


VII 14 




VII 15 




VII-17 


44416 9 


VII-18 


44417 0 
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VII-19 


#£2-171 


VI 1-2 0 


#£#172 


VII-21 


#£4173 


VII-22 


###174 


VI I -2 3 


###175 


VII-24 


###176 


VII-27 


###17 7 


VI I 2 9 


5 5 9 


VII-2 9 


###■178 


VII-32 


###179 


VII-33 


###18 0 


VI I 3 4 


### 


VII-35 


###181 


VII-3 6 


###182 


VII-3 9 


###183 


VII-40 


###184 


VII-41 


###185 


VII-42 


###186 


VII-43 


###18 7 


VII-44 


###18 8 


VII-45 


###189 


VII-4 6 


###19 0 


¥## — 4# 


### 


VII-4 9 


###191 


VII-50 


###192 


VII-52 


###1 93 


VII-53 


###1 94 


VII-54 


###1 95 


VII-55 


###196 


VII-57 


###19 7 


VII-58 


###198 


VII-59 


###199 


VII-62 


###2 0 0 


VII-63 


###2 01 


VII-64 


###2 0 2 


VII-65 


###2 0 3 


VII-66 


###2 0 4 


VII-71 


###2 0 5 


VII-72 


###2 0 6 


VII-73 


###2 0 7 


VII-74 


###2 0 8 


VII-76 


###2 0 9 


VII 7 7 


5 0 4 


VII-80 


###2 1 0 


VII-81 


###211 


VII-82 


###212 


VII 8 3 


5 Q Q 


VII-84 


###213 


VII-87 


###214 


VII-89 


###215 


VII-90 


###216 
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VII-91 


4*44*217 


VII-92 


444218 


VII-93 


£445-21 9 


V444 — 44 


g ]_ g 


VII-96 


^220 


VIII-09 


444*2 21 


VII-10 


444*2 2 2 


VII 11 


g 2 Q 


VII-12 


4442 2 3 


VII-13 


444*2 2 4 


VII 15 


444* 


VII-1 6 


4*442 2 5 


VII-17 


4442 2 6 


VII-18 


4442 2 7 


VII-19 


4442 2 8 


VII-20 


£448-2 2 9 


VII-21 


4444*2 3 0 


VII-23 




VII-24 


4434,232 


VII-25 


443442 3 3 


VII-28 




VII-29 


44343-2 3 6 


VII-30 


4*442 3 7 


VII-31 


4434-2 3 8 


VII-32 


4434*2 3 9 


VII-33 


4*44*2 4 0 


VII 3 4 


g q q 


VI I -3 6 


4*442 41 


VII-37 


444*2 4 2 


VII-38 


4*442 4 3 


VII-40 


4442 4 4 


VII-41 


4442 4 5 


VII-42 


4*442 4 6 


VII-43 


4*44-2 4 7 


VII-45 


4>48-2 4 8 


VII-4 6 


4444*2 4 9 


VII-47 


4442 5 0 


VII-48 


44*42 51 


VII-50 


4434*2 5 2 


VII-51 




VII-53 


44442 5 4 


VII-54 




VII-55 


4*442 5 6 


VI I -5 6 


4*442 5 7 


VII-57 


4*442 5 8 


VII 5 8 


44*4* 


VII-59 


4*44*259 


VI I- 60 


444260 


VII 61 


4*4*4 


VII-64 


4*443-2 61 


VII 65 


g g g 
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VII-66 


^44>2 62 


VII-67 


44&-G-2 63 


VII 6 9 




V4-4 — 


g g Q 


VII-70 


4^44*2 64 


VII-71 


4474*2 65 


VII-72 


44742 6 6 


VII-73 


444442 6 7 


VII-74 


4443-2 6 8 


VII-75 


4442 6 9 


VII-76 


44442 7 0 


VII-77 


4442 71 


VII 7 9 


444 


VII 7 9 


g n o 


VII-80 


4442 7 2 


IV 0 2 




IV-04 


4442 7 3 


IV- 13 


44442 7 4 


IV- 14 


4442 7 5 


IV- 17 


44442 7 6 


IV-28 


4442 7 7 


IV-31 


444742 7 8 


IV-32 


4442 7 9 


IV-38 


4449-2 8 0 


IV-40 


44442 81 


IV-42 


4442 8 2 


IV-44 


44442 8 3 


IV-47 


44442 8 4 


IV-55 


44442 8 5 


IV 5 6 


g g g 


IV- 61 


44442 8 6 


IV- 6 4 


44442 8 7 


IV- 6 5 


44943-2 8 8 


IV- 7 2 


4442 8 9 


IV- 7 3 


44442 9 0 


IV- 8 0 


m2 91 


IV- 8 5 


4442-2 92 


IV- 9 3 


44442 93 


IV- 9 5 


44442 94 


IV- 9 6 


44442 95 


V-0 3 


4444-2 9 6 


V-0 4 


4447-2 9 7 


V-0 7 


44442 9 8 


V-0 8 


44492 9 9 


V-0 9 


4443 0 0 


V-12 


4443 01 


V 19 


444 


V-2 0 


4443 0 2 


V-24 


444303 


V 37 


444 


V-40 


4443 0 4 
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V-41 


7^-4-8-3 0 5 


V-48 


7^4-#3 0 6 


V-5 7 


7^3 0 7 


V-61 


7^3 0 8 


V-64 


7^3 0 9 


"W" g ^ 


7^ 


V-7 4 


7^-2-4-310 


V-8 0 


3-3-6311 


V-81 


4^7-312 


V-8 7 


7^313 






VI-13 


■8-T-&3 4 2 


VI-14 


■8-7-4-3 4 3 


VI-16 


-8-7-^3 4 4 


VI-23 


■8-7-8-3 4 7 


VI-24 


■8-7-^3 4 8 


VI-28 


■8-8^-3 5 0 


VI-32 




VI 3 8 


QQ|C 


VI-39 


■8-8-7-3 5 2 


VI -4 5 


-848-43 5 3 


VI-4 6 


&W3 5 4 


VI-49 


■8-^2-5 01 


VI-50 


■8-^3 5 6 


VI 5 2 




VI-53 


■8-^§-3 5 7 


VI-54 


■8-^3 5 8 


VI-55 


■84^7-3 5 9 


VI-57 


■8-^8-3 6 0 


VI-58 


■84^3 61 


VI-63 


-444>3 62 


VI 65 


9Q2 


VI- 6 6 


#443-3 63 


VI-67 


64 


VI-74 


#445-3 65 


VI-75 


4>443 6 6 


VI-76 


W7-3 6 7 


VI-80 


#443-3 6 8 


VI 81 


#44> 


VI-85 


#4-43 6 9 


VI-87 


#4-4-3 7 0 


VI-88 


#4-4-3 71 


VI-91 


#4-4-3 7 2 


VI-94 


#4-43 7 3 


VI-95 


#4-43 7 4 


VI 9 6 


#4-4 


I 13 


1177 


1-14 


1178397 


1-30 


4-4-44>398 


1-54 


1181399 


1-88 


1182400 
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111-20 


1183401 


IV- 15 


1185402 


IV- 2 6 


118 6403 


IV 62 


113 7 


V-ll 


1188404 


IV- 2 8 


1189405 


IV- 3 8 


1190406 


IV- 4 5 


1191407 


VI-44 


1193409 


VII-47 


1200415 


1-42 


1332445 


I 52 


1333 


1-86 


133 6447 


1-95 


1337448 


III 10 


1342 


— 


13 47 


111-82 


13 48451 


111-92 




IV- 2 3 


1353454 


IV 3 4 


135 4 


IV- 5 5 




IV 41 


135 6 


IV 4 5 


1357 


IV- 8 2 


1359456 


V-01 


13 614 5 8 


V 02 


1362 


y Q (Z 


1363 


V-ll 


1364459 


V-25 


13 654 60 


V-3 5 


1366461 


V-42 


13674 62 


V-47 


1368463 


V-4 9 


13694 64 


V-5 8 


1370465 


V-7 5 


13724 67 


V 7 9 


1373 


V-9 0 


137 44 6 8 


V 91 


1375 


V 9 4 


2_ g n f 


VI 10 


1379 


VI 41 


1381 


VI-43 


1382471 


VI-71 


1387472 


VI 7 2 


poo 


VI-79 


138 9473 


VI-90 


1390474 


VI-93 


1391475 


VI 1-2 5 


13 9 7 4 8 0 


VII 6 0 




VII-67 


1399481 


VIII 2 2 
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VI 1 1 2 6 


1404 


VI 1 1 3 9 


1405 


VIII — 44 


1406 


1-37 


1440482 


V 3 2 


14 45 


V-5 2 


1447483 


V-6 8 


1448484 


V-92 


1449485 


VI-42 


14 5 0 4 8 6 


VI 7 8 


1452 


VII-86 


1453487 


VII-88 


1454488 


IV-29 


1^90491 


V-15 


4^#^491 


V-3 9 


1492493 


V-54 


1493494 


V-5 9 


14 944 95 


V-71 


1495496 
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Table 5 



Samples

Diagnosis         No. of women
Normal/Benign     42*
DCIS              3
Invasive cancer   26



*From one woman, whole blood was collected at weeks 1, 2, 3, 4 and 5 following menstruation.
Hence, the number of unique normal/benign samples tested in the experiment is 75.



Information about women with breast cancer 



Sample  AGE  Stage  Cancer type                          Size hist. (mm)    Nodes
1       51   II     IDC                                  20                 1/7
2       84   II     IDC                                  22                 2/2
3       50   I      DCIS + 1 IDC                         >50 DCIS; 5 x 14   0/7
4       47   I      IDC                                  15                 0
5       69   III    ILC g.2 + tubular adenocarcinoma     50 + 3             1 of 12 + 1 of 7
6       50   II     IDC                                  24                 0
7       65   I      IDC                                  15                 0
8       63   II     IDC                                  23                 0
9       55   I      IDC + DCIS                           4                  0 of 1
10      52   0      DCIS + small colloid carcinoma foci  50 + 3             0
11      60   II     IDC                                  24                 0
12      54   I      IDC                                  11                 0
13           0      DCIS                                 20                 0
14      49   0      DCIS                                 9                  0
15      48   I      IDC                                  4                  0
16      56   I      IDC                                  4                  0
17                                                       14
18      68   I      IDC                                  7                  0
19      63   I      IDC                                  10                 0
20      45   I      IDC                                  19                 1
21      57   III    IDC                                  60                 8/20
22      55   II     IDC/DCIS                             35 + 55            0
23      71   I      IDC/extensive DCIS                   8                  0
24      56   I      IDC                                  9
25      66   II     IDC                                  26                 0
26      66   I      IDC                                  15
27      61   I      IDC                                  9                  2
28      ?    ?      ?                                    ?                  ?
29      65   I      IDC                                  11                 0




Other diseases/conditions present in the women tested

Disease/condition
Diabetes
Asthma
Ulcerative colitis
Hemochromatosis
Crohn's disease
Fibromyalgia
Psoriasis
Rheumatism
Allergies



Prior history of cancer in the women tested 



Cancer type    No. of women
Breast         3
Colon          2
Stomach        1
Skin           1
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Number of samples tested by double cross validation and success of the diagnostic test 
for breast cancer based on selected informative genes
Number of samples tested by double cross validation 



Number of unique samples tested               75
Number of unique non-cancer samples tested    46
Number of cancer samples tested               29



Success of the diagnostic test for breast cancer based on selected informative genes 



Occurrence in  Number of    Specificity  Sensitivity  Accuracy  False     False     Total
percentage*    informative                                      positive  negative  error
               probes                                           rate      rate      rate
100.00         23           84.78        75.86        81.33     15.22     24.14     18.67
90.00          44           91.30        79.31        86.67     8.70      20.69     13.33
80.00          51           86.96        79.31        84.00     13.04     20.69     16.00
70.00          54           89.13        75.86        84.00     10.87     24.14     16.00
60.00          58           89.13        75.86        84.00     10.87     24.14     16.00
50.00          59           89.13        75.86        84.00     10.87     24.14     16.00
40.00          63           89.13        75.86        84.00     10.87     24.14     16.00
30.00          66           86.96        75.86        82.67     13.04     24.14     17.33
20.00          74           89.13        75.86        84.00     10.87     24.14     16.00
10.00          79           89.13        75.86        84.00     10.87     24.14     16.00
5.00           90           86.96        79.31        84.00     13.04     20.69     16.00
1.33           139          84.78        72.41        80.00     15.22     27.59     20.00



*100% = Genes appearing in all the 75 cross validated models; 90% = Genes appearing in at least 68 out of 75 cross validated models; 
5% = Genes appearing in at least 4 out of 75 cross validated models; and so on. 
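
As a consistency check on the percentages tabulated above, the rounded specificity and sensitivity values can be converted back to whole-sample counts using the 46 non-cancer and 29 cancer samples stated for this test, and the accuracy recomputed from those counts. The following is a minimal sketch; the function name `implied_counts` is hypothetical and the calculation is an illustration, not the procedure used to produce the table.

```python
def implied_counts(specificity_pct, sensitivity_pct, n_non_cancer=46, n_cancer=29):
    """Recover whole-sample counts from rounded percentages and recompute accuracy.

    Returns (correct non-cancer calls, correct cancer calls, accuracy in %).
    The default sample sizes are the 46 non-cancer and 29 cancer samples
    reported for the breast cancer test.
    """
    true_neg = round(specificity_pct * n_non_cancer / 100)
    true_pos = round(sensitivity_pct * n_cancer / 100)
    accuracy = 100 * (true_neg + true_pos) / (n_non_cancer + n_cancer)
    return true_neg, true_pos, round(accuracy, 2)


# Row for 100% occurrence (23 probes): 39 of 46 and 22 of 29 correct.
print(implied_counts(84.78, 75.86))
# Row for 90% occurrence (44 probes): 42 of 46 and 23 of 29 correct.
print(implied_counts(91.30, 79.31))
```

Under these assumptions the recomputed accuracies agree with the tabulated accuracy column, which suggests the rows are internally consistent.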
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Table 7 



Double cross-validation and details of the success of the diagnostic test for
Alzheimer disease based on the expression of 182 informative genes

Validation Result                                                Success of diagnostic test
Total number of samples tested                                   14
Number of Alzheimer's disease samples tested                     7
Number of Alzheimer's disease samples incorrectly predicted      1
Number of non-Alzheimer's disease samples tested                 7
Number of non-Alzheimer's disease samples incorrectly predicted  0

Performance          Description
Accuracy             Percentage of the total number of predictions that were correct            92.9
Sensitivity          Percentage of positive cases that were correctly identified                85.7
Specificity          Percentage of negative cases that were correctly predicted                 100
False positive rate  Percentage of negative cases that were incorrectly classified as positive  0.0
False negative rate  Percentage of positive cases that were incorrectly classified as negative  14.3
Total error rate     Percentage of the total cases incorrectly predicted                        7.1
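
The performance measures defined in Table 7 follow directly from the four validation counts. The sketch below shows one way to compute them; the function name `diagnostic_metrics` and its argument names are hypothetical, and all values are returned as percentages.

```python
def diagnostic_metrics(n_pos, pos_wrong, n_neg, neg_wrong):
    """Compute the Table 7 performance measures (in %) from validation counts.

    n_pos / n_neg: number of disease / non-disease samples tested.
    pos_wrong / neg_wrong: how many of each were incorrectly predicted.
    """
    true_pos = n_pos - pos_wrong
    true_neg = n_neg - neg_wrong
    total = n_pos + n_neg
    return {
        "accuracy": 100 * (true_pos + true_neg) / total,
        "sensitivity": 100 * true_pos / n_pos,
        "specificity": 100 * true_neg / n_neg,
        "false_positive_rate": 100 * neg_wrong / n_neg,
        "false_negative_rate": 100 * pos_wrong / n_pos,
        "total_error_rate": 100 * (pos_wrong + neg_wrong) / total,
    }


# Counts reported in Table 7: 7 Alzheimer samples (1 incorrectly predicted)
# and 7 non-Alzheimer samples (0 incorrectly predicted).
for name, value in diagnostic_metrics(7, 1, 7, 0).items():
    print(f"{name}: {value:.1f}")
```

Rounded to one decimal place, these counts reproduce the values listed in the table.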
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Some relevant features of the blood donors. B, Female donors with breast cancer; N, Female donors with suspected mammogram but no 

breast cancer; IDC, invasive ductal carcinoma; DCIS, ductal carcinoma in situ; na, not available; nd, not determined; ++, no degradation
of mRNA and no ribosomal contamination in the sample; +, no degradation of mRNA but ribosomal contamination in the sample.







           AGE  Cancer type/breast abnormality    Size Hist. (mm)  mRNA Quality
1   B1     na   IDC                               5                ++
2   B2     49   DCIS                              8                nd
3   B3     54   IDC                               18               ++
4   B4     59   IDC                               12               +
5   B5     61   DCIS + micro invasive cancer      15+1.5           ++
6   B6     55   IDC                               12+17            nd
7   B6          IDC                               12+17            nd
8   N1     45   Fibroadenoma                                       nd
9   N2     52   na                                                 +
10  N3     55   Cyst                                               ++
11  N4     54   na                                                 ++
12  N5     51   Benign ductal epithelium                           nd
13  N6     57   Benign                                             nd
14  N7     50   na                                                 ++
15  N8     52   na                                                 +
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Table 9 

List of sequences of probes informative for both Alzheimer disease and breast cancer





Clone ID    SEQ ID NO. in Sequence Listing


1-24 


3&&11 


1-25 




1-28 


^44>13 


1-48 


4^2-4-19 


1-60 


4^425 


I 72 


444 


1-81 


44431 


1-82 


4443 2 


11-02 


4443 3 


11-03 


4443 4 


11-06 


4443 6 


11-07 


4£437 


11-10 


4443 9 


11-21 


4444 5 


11-23 


4444 6 


11-24 


4444 7 


11-25 


4444 8 


11-27 


4445 0 


11-33 




11-34 


4445 6 


11-41 


44460 


11-42 


44461 


11-46 


44464 


11-47 


44565 


11-48 


44466 


11-52 


44468 


11-57 


44-4-73 


11-58 


44-5-74 


11-59 


44475 


11-60 


44-47 6 


11-61 


444577 


11-62 


44-478 


11-64 


4448 0 


11-67 


424-83 


11-69 


45485 


11-70 


4548 6 


11-74 


45490 


11-80 


44496 


11-82 


44498 


11-84 


4449J3 


11-87 


444-10 0 


11-88 


442-101 
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11-96 


444105 


111-01 


44410 6 


111-02 


444107 


1 1 1 - 0 6 


444109 


111-08 


4^111 


111-12 


444114 


111-13 


444115 


III 17 




111-18 


444116 


111-21 


444117 


111-23 


444119 


111-24 


44412 0 


111-25 


444121 


111-26 


444122 


111-27 


444123 


111-28 


444124 


111-29 


444125 


111-32 


444127 


111-33 


444128 


111-35 


444130 


111-39 


444131 


111-40 


444132 


111-42 


444133 


111-45 


444135 


III-4 6 


44413 6 


111-47 


444137 


111-48 


444138 


111-56 


444144 


111-57 


444145 


111-58 


444146 


111-59 


444147 


111-61 


444148 


111-62 


444149 


111-63 


444150 


111-64 


444151 


111-66 


444152 


111-67 


444153 


111-70 


444154 


111-74 


444155 


III-5 


444156 


111-78 


444157 


111-80 


444158 


1 1 1 - 8 1 


444159 


111-85 


444161 


111-86 


444162 


111-88 


444163/164 


111-89 


444165 
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111-93 


4>^4416 6 


111-95 


44^4168 


III 96 




IV- 0 4 


44444273 


IV- 13 


444^3-274 


IV- 14 


4S&4275 


IV- 17 


44444276 


IV-31 


4^278 


IV- 3 2 


£44442 7 9 


IV- 3 8 


4444442 8 0 


IV- 4 2 


4444282 


IV- 4 7 


44442 8 4 


IV- 61 


44442 8 6 


IV- 6 4 


44442 8 7 


IV- 7 2 


444492 8 9 


IV- 8 0 


44442 91 


IV- 8 5 


44444292 


IV- 9 3 


4444293 


IV- 9 6 


44442 95 


V- 0 3 


444442 9 6 


V-0 4 


44442 9 7 


V-0 7 


444442 98 


V-0 8 


444442 9 9 


V-12 


4443 01 


V-24 


444303 


V-41 


44443 0 5 


V-57 


4444307 


V-61 


44443 0 8 


V-64 


44443 0 9 




7 2 3 


V-7 4 


4444310 


V-80 


4444311 


VI 0 3 


444 


VI 0 4 




VI 0 7 


444 


VI 0 9 


444 


VI 0 9 


g q g 


VI-12 


44444341 


VI-14 


44443 4 3 


VI 19 


444 


VI 2 0 


4444 


VI 21 


444 


VI-23 


444443 4 7 


VI 25 


353 


VI 2 6 




VI 4 8 


g ^ o 


VI -5 0 


444443 5 6 
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VI-53 


-8-94^-3 5 7 


VI-74 


■9^3 65 


VI-76 


49447-3 6 7 


VI-87 


^94-4-3 7 0 


VI-88 


494-^-3 71 


VI-95 


■94-^3 7 4 


VII 0 2 


4>4-4 


VII 0 3 


^ /j Q 


VII 0 6 


4>4>4- 


VII 0 8 


4^ 


VII 0 8 




VII 10 


4>4>4> 


VII 11 




VII 15 


5 5 9 


VII 17 


5 gQ 


VII-19 


^a-i7i 


VII-21 


4>4^173 


VII-22 


4>4^174 


VII-23 


4^175 


VII-24 


44€-7-17 6 


VII-27 


4>4^-17 7 


VII-29 


4>7-4417 8 


VII-32 


4^7-4-179 


VII-33 


^180 


VII-36 


W^182 


VII-28 


4>4M4183 


VII-41 


W^l_85 


VII-42 


^7-9-18 6 


VII-43 


4>44£18 7 


VII-46 


4>4444190 


VII 4 8 


5 g 4 


VII-49 


4>444>191 


VII-54 


4>4449195 


VI 1-5 7 


■§-94-1 97 


VI 1-5 8 


4>49£198 


VII-59 


4>49^199 


VII-62 


4>4942 0 0 


VII-63 


4*4*4>2 01 


VI I- 64 


^49^2 0 2 


VII-66 


4>49&2 0 4 


VII-72 


£440-2 0 6 


VII-73 


4>O4-2 0 7 


VII 7 7 


5 0 4 


VII-80    210
VII-82    212
VII-87    214
VII-90    216
VII-91    217
VII-92    218
VII-93    219
VII-96    220
VIII-09    221
VIII-10    222
VIII-13    224
VIII-16    225
VIII-20    229
VIII-21    230
VIII-23    231
VIII-24    232
VIII-25    233
VIII-28    235
VIII-29    236
VIII-30    237
VIII-31    238
VIII-32    239
VIII-33    240


VIII 3d 


£4£ 


VIII-38    243
VIII-40    244
VIII-41    245
VIII-46    249
VIII-48    251
VIII-55    256
VIII-57    258
VIII-59    259
VIII-60    260


VIII 61 


(Z (Z 2 


VIII-64    261
VIII-66    262
VIII-73    267


VIII-74 




VIII-76    270
VIII-80    272



Nucleotide sequences 

Sequence ID £3- SEQ ID NO: 1 nt: 405

GGATCCTGTGGCCCACAGAGCTGCCCCAGCAGACGCTCCGCCCCACCCGGTGATGG 
AGCCCCGGGGGGACAATCGTGCCTGGGGAGGAGCAGGGTACAGCCCATTCCCCCAG 
CCCTGGCTGACCTGGCCTAGCAGTTTGGCCCTGCTGGCCTTAGCAGGGAGACAGGG 
GAGCAAAGAACGCCAAGCCGGAGGCCCGAGGCCAGCCGGCCTCTCGAGAGCCAGAG 
CAGCAGTTGAATGTAATGCTGGGGACAGGCATGCTGCCGCCAGTAGGGCGGGGACC 
CGGACAGCCAGGTGACTACCAGTCCTGGGGACACACTCACCATAAACACATCCCCA 
GGCAGGACAGATCGGGGAAGGGGTGTGTACCAGGCTATGATTTCTCTTGCATTAAA 
ATGTATTATTATT 

Sequence ID 108 SEQ ID NO: 2 nt: 550

GGCTTTGACAGAGTGCAAGACGATGACTTGCAAAATGTCGCATCTGGAACGCAACA
TAGANACCATCATCAACACCTTCCACCAATACTCTGTGAAGCTGGGGCACCCAGAC 
ACCCTGAACCAGGGGGAATTCAAAGAGCTGGTGCGAAAAGATCTGCAAAATTTTCT
CAAGAAGGAGAATAAGAATGAAAAGGTCATAGAACACATCATGGAGGACCTGGACA 
CAAATGCAGACAAGCAGCTGAGCTTCGAGGAGTTCATCATGCTGATGGCGAGGCTA 
ACCTGGGCCTCCCACGAGAAGATGCACGAGGGTGACGAGGGCCCTGGCCACCACCA 
TAAGCCAGGCCTCGGGGAGGGCACCCCCTAAGACCACAGTGGCCAAGATCACAGTG 
GCCACGGCCACGGCCACAGTCATGGTGGCCACGGCCACAGCCACTAATCAGGAGGC 
CAGGCCACCCTGCCTNTACCCAACCAGGGCCCCGGGGCCTGTTATGTCAAACTGTC 
TTGGCTGTGGGGCTAGGGGCTGGGGCCAAATAAAGTCTCTTTCTCC 

Sequence ID 110 SEQ ID NO: 3 

ACGAAGACAGACATCTGTGGAATGATTCACATCCTCTCAAGTTAGGAGGATGGAGG 
CCTGCTTCATTAAGAAGCTGGGGGTAGGGTGGGGGTGGGGAGAACACTTAACAACA 
TGGGGACCAGTCAGGGGAATCCCCTTATTTCTGTTTTGCATATGAGGAACCCTAGA 
GCAGCCAGGTGAGGCTCTCTAGTTTAATAAAAATCATGGAAAGACTCTTAATGCAG 
ACTCTTCTTAAGTGTTAATAGGGATTTTTTCAGCTTATTTTGGTTGCAGTTTCCAA 
TTTTTAAAAATGTTGAGGTAATCTTTCCCACCTTCCCAAACCTAATTCTTGTAGAT 
GCATTAGTGTTGAACCAATGCTTTCTCATGTCTCAATTCTTTGTATATGCATTCTT 
TTCAGATGTATTAAACAAACAAAAACCCTTC

Sequence ID 3r»2- SEQ ID NO: 4 nt: 286

CCGGTAATAGAATAGAAAAGGGAGAGTGTCTTCATGCAATGTGGCATCCTGGATTG 
GGTCTCGNNACAAAAACAGGACATTAGTGGGAAAATTGGAAATCTGAAAAAAGTCT
GAATTTTAGTTAATATACCAATTTCAGTCTCTTGGTTTTGACAGATGTACCATGGT 
GATGTAAGATGTTGACCTTGGGGTAGGCTGGGTGAAGGGTATACAGGAACTCTTTG 
TACTATCTCTGCAACTTCTCTGTAAATCTAGTATCATTCCAAAATAAAAGTTTATT
TAATTT 

Sequence ID 250 SEQ ID NO: 5 

GTGGAAGTGACATCGTCTTTAAACCCTGCGTGGCAATCCCTGACGCACCGCCGTGA 
TGCCCAGGGAAGACAGGGCGACCTGGAAGTCCAACTACTTCCTTAAGATCATCCAA 
CTATTGGATGATTATCCGAAATGTTTCATTGTGGGAGCAGACAATGTGGGCTCCAA 
GCAGATGCAGCAGATCCGCATGTCCCTTCGCGGGAAGGCTGTGGTGCTGATGGGCA 
AGAACACCATGATGCGCAAGGCCATCCGAGGGCACCTGGAAAACAACCCAGCTCTG 
GAGAAACTGCTGCCTCATATCCGGGGGAATGTGGGCTTTGTGTTCACCAAGGAGGA 
CCTCACTGAGATCAGGGACATGTTGCTGGCCAATAAGGTGCCAGCTGCTGCCCGTG 
CTGGTGCCATTGCCCCATGTGAAGTCACTGTGCCAGCCCAGAACACTGGTCTCGGG 
CCCGAGAAGACCTCCTTTTTCCAGGCTTTAGGTATCACCACTAAAATCTCCAGGGG 
CACCATTGAAATCCTGAGTGATGTGCACTGATCAAGACTGG

Sequence ID 299 SEQ ID NO: 6 

CAGCGCAGGGGCTTCTGCTGAGGGGGCAGGCGGAGCTTGAGGAAACCGCAGATAAG 
TTTTTTTCTCTTTGAAAGATAGAGATTGNTACAACTACTTAAAAAATATAGTCAAT
AGGTTACTAAGATATTGCTTAGCGTTAAGTTTTTAACGTAATTTTAATAGCTTAAG 
ATTTTAAGAGAAAATATGAAGACTTAGAAGAGTAGCATGAGGAAGGAAAAGATAAA 
AGGTTTCTAAAACATGACGGAGGTTGAGATGAAGCTTCTTCATGGAGTAAAAAATG 
TATTTAAAAGAAAATTGAGAGAAAGGACTACAGAGCCCCGAATTAATACCAATAGA 
AGGGCAATGCTTTTAGATTAAAATGAAGGTGACTTAAACAGCTTAAAGTTTAGTTT 
AAAAGTTGTAGGTGATTAAAATAATTTGAAGGCGATCTTTTAAAAAGAGATTAAAC
CGAAGGTGATTAAAAGACCTTGAAATCCATGACGCANGGAGAATTGCGCATTTAAA 
GCCTAGTTACGCATTTACTAAACGCAGACGAAAATGGGAAGATTAATTGGGAGTGG 
TAGGATGAAACAATTTTGGAGAAGATAGAAG

Sequence ID 300 SEQ ID NO: 7 

CTCAAAGGAGAAAAAAAACCTTGTAAAAAAAGCAAAAATGACAACAGAAAAACAAT
CTTATTCCGAGCATTCCAGTAACTTTTTTGTGTATGTACTTAGCTGTACTATAAGT 
AGTTGGTTTGTATGAGATGGTTAAAAAGGCCAAAGATAAAAGGTTTCTTTTTTTTT 
CCTTTTTTGTCTATGAAGTTGCTGTTTATTTTTTTTGGCCTGTTTGATGTATGTGT 
GAAACAATGTTGTCCAACAATAAACAGGAATTTTATTTTGCTGAGTTGTTCTAAAA
AAAAAAAAAAAAAAAAA

Sequence ID 302 SEQ ID NO: 8 

AGTAGAGACGGGGTTTCACTGTGTTAGCCAGGATGGTCTCGATCTCCTGACCTCGT 
GATCCGGCCACCTCGGCCTCCCGAAAGTGCTGGGATTACAGGCGTGAGCCACGGCG 
CCCAGCCCCAGCCTGTCACTTAAACTGATAAACGACAGATTAACAGTAGAAAAATT 
TTATTTTGCATACATAATGAGGCTTCACAAAAGAGAAGTGAAAACCCAAGTAGGAG 
TTTAGGGCTGGGGGCTTATATACCATTTAACAAGGGGTGATAAATTGTAAGAGAAT 
AG 

Sequence ID 304 SEQ ID NO: 9 

TCCTTGGTTTCGATTTGTGGCAACAATCCAGTCTTTTTGTTTTTTTCAGGGATACC 
ATATGTAACAGGTGCCATTGTTACTGTAACTTTTCACACATGCCTTCAGTTTGATG 
TCAAAGTCATCATTTAGTGTAAACAGCAAGTTATCTGTTAGGCTGCACATCATGAA 
CTTTACTTTTAGAAAGTCTTATCTTTTATGCCACAGAAATAGCATTTGGCTATTAG 
TCATGGATGGCAAAGAAATTAATTTTGAGTTGTTTGGATAAAAATGTTTCAGTTGA 
CTGTAGTGTGTATTGAGAGACACTGCCAGTAAACAAACTCTCTTGGTAGGTGGAAA 
TCCCCTAGAAGTTACAGAAAATTGGGAGGAGGTGAACTTAATTAAATAACTTGAAT 
TGTTTAGACATATTCAGAGCTTCTTATGACCTTGAAGAAATCACCCAACTTCAAAA
GACCTCGGTTTCTTCATTTGTAAAATTAGGGAGTTTGACTAGATGTGTAAATCTAG
TTGTTAGTTAACTTCTAAGATGTAAAAACCCTCTTGTTTAACAAAAACCTACAAGA
TCAAGTTGCTTATCTGAAATCTTTATGAATCAACACTAGTCACTAAGTCTAGCTCG 
ACC 

Sequence ID 306 SEQ ID NO: 10

CTTTTCCTCCCGCTGTCCCCCACGGAGGGGACTGCTCTCCCCCGCTGCATCCTTTC 
TGTGAGGTACCTTACCCACCTCAGCACCTGAGAGGGTGAAATAGAATTCTAACCTC 
GACATTCGGGAAGTGTTTTTGAGAAGTCTCGGTCGGTAAGGGAAGTCTTCCAAGTC 
CGTGCAGCACTAACGTATTGGCACCTGCCTCCTCTTCGGCCACCCCCCAGATGAGG 
CAGCTGTGACTGTGTCAAGGGAAGCCACGACTCTGACCATAGTCTTCTCTCAGCTT 
CCACTGCCGTCTCCACAGGAAACCCAGAAGTTCTGTGAACAAGTCCATGCTGCCAT 
CAAGGCATTTATTGCAGTGTACTATTTGCTTCCAAAGGATCAGGCCCTGAGAACAA 
TGACCTTATTTCCTACAACAGTGTCTGGGTTGCGTGCCAGCAGATGCCTCAGATAC 
CAAGAGATAACAAAGCTGCAGCTCTTTTGATGCTGACCAAGAATGTGGATTTTGTG 
AAGGATGCNCATGAANAAATGGACNAGCTGTG

Sequence ID 308 SEQ ID NO: 11 nt: 373

AAGTGGGTCTTGCCATCCCTGAACTGNAATCATCCCTAACATATTCATACCTGTTT 
TCATTTTAAAAGTTGGGTCAGTTTTTTTATTAGTACATGTATTTCTATCCTACTGA 
TTTATTTGCTATATCATCTAATTTAGTTTGAATATTCCATAATTTACTTAATTAGT 
CCTGTATGGAGACCTAGCTCTTCTCAGTGTCTACTATTATAAACAATGCTACAGTG 
AATATTGGTGNATAAATCCATACNCACCACGTACATATCTTAAGTTCTGGAAGAGA 
TATTGCTAAACCAGAAGATAACCTGCATTTAAAATTTGACTGCTAGGGNCAGGGNC 
ACATTTAATTAAATTAGAACAANGAATGCATAATGNC

Sequence ID 309 SEQ ID NO: 12 

CCGGAATCGCGGCCGCGTCGACGAAAATATGTGCCCTGGCCAACTCCACAGGACTA 
GTTCTAGGCAATCTGAAGGAAACCAGAAAATGTGAATTTCTCTTCCCTCAAAAAGC
TATACTGAAGTAGTATTTAATATTCAAGTACTTGTAAATTTGCAGAACAGTACTTT
TTAATTTGACCCATGAATTCTATTTAAATTTGTCACTTAATATTTAGCCAAGAAGC 
AAACCATCTAAAAAGATTTCTGGTTTATTTCTCCAACTCCTAATAAATAGGGTCAC 
ATATTTTTTAACTTTTTTCTAATTTGAAAAGTAATACAGGCATATGGTATTTTAAA 
AATGAAACAACACAAAGGGATATGTTTTGAAAAGTGGTCTTGCCATCCCTGAACTG 
TAATCATCCCTAACATATTCATACCTGTTTTCATTTTAAAAGTTGGGTCAGTTTTT 
TTATTAGTACATGTATTTCTATCCTACTGATTTATTTGCTATATCATCTAATTTAG 
TTTGAATATTCCATAATTTACTTAATTAGTCCTGTATGGAGACCTAGCTCTTCTCA 
GTGTCTACTATTATAAACAATGCTACAGTGAATATTGGTGNATAAATCCTACACAC 
CACGTAACATATCTTAAGTTCCTGGAAGAGATATTGCTAAACCAGAAGATAACCTG 
CATTTAAAATTTGACTGCTAGGGTCAGGGTCACATTTAAATTAAATTAGAACAAGG
AATGCATAATGTCTTCGATAGCAATCTATTCAAGGTGCACCGTGGTCACAAAGGAA 
AGCAAAACTGTC

Sequence ID 310 SEQ ID NO: 13 nt: 564

CCTGGNCAGAGGCCTCTATCCTGTANTGATAATTGCCATCAAAATTGTCAAAAANG
ATTTAATTTCTATGGGNAATAGTCCTTTTCTTAGCTTCTGCCNNTCACTTGCTTAT 
TTTTTGTGTGGGAATGGGGTTGGATAAACCAATGAACTTTATTATAAACAAATCCC 
ACCTATATCTANCAAATTTATATTTTCGGTGAAATACAGATATTTGCCTTTCTGGA 
GTANTATAGAAGCTGTCAATATGTATCTACTGTACAGTACTAAATAGTATTCATTT 
ATGAAATGAGTAGTGTTTGGGTGGCTGGGGTTAAGGAAAAATGAGACTTGGAATTG 
TAGCTTTTATCCAAGTTTTGAGTATAAATAGGGTTTTGTTTTGTTTTTTTTAACCT 
AAAAACTGAAATGCCATATAGAAAAACAGCATTGTTTTTACAGTTTGTAGTAAGTA 
ACTTTTTAAAGATTTTATCAAAAAGAATTTTGTCTATNGTGAGTAAAAGAAGTTCT
AATAATGGCCTAATCACTGCATTTTTAAAAAACAAAGTTCAACACAAATGACATTT
GTTT 

Sequence ID 311 SEQ ID NO: 14 

CCTCTCCTCCATCTAAAGGCAACATTCCTTACCCATTAGTCTCAGAAATTGTCTTA 
AGCAACAGCCCCAAATGCTGGCTGCCCCCGGCCAAGCATTGGGGCCGCCATCCTGC 
CTGGCACTGGCTGATGGGCACCTCTGTTGGTTCCATCAGCCAGAGCTCTGCCAAAG 
GCCCCGCAGTCCCTCTCCCAGGAGGACCCTAGAGGCAATTAAATGATGTCCTGTTC 
CATTGG 

Sequence ID 3^- SEQ ID NO: 15 nt: 554

CCCGGAATCGCGGCCCGCGTCGACAACAAACCTGCATGTTCTGCACATGTATCCAG 
GAACTTAAAAAAAAAAAAAGATAGTTTGTGTGTCTTAATTGAATAATAGTAGATTT
ATAGATTAAAGATCTATGGGTTTTTAATATGGATTANAAATCTGTGGGTTTTTGAT 
ATGGATTANAAATCTGTGGGTTTTTAATATGGATTGGAAATCTGTGGGTTTTTAAT 
ATGGATTAAAAAACATCTGTGGGTTTTTAATATGGATTAAACATCTGTGGGTTTTT 
AATATGGATTAAACATCTGGGTTTTTAATATGGATTAAACATCTGTGGGTTTTTAA 
TATGGGTTAAAAATCAAAAGAAAATGAACTATTTGCTCCAGTGCAGGAAAATACAG 
GCAATACTGGATACAATTAGATGGTCAGGAGCGATAACCCGGTTGCCATTGTTTGA 
AGAAGAGAATAAGGNGCTAGCATTCCTATCCGTAGATAATTTGACAGCTAGGAAAT 
AGGGGGAGTCTTCTATGTAGTTAGTGAAGGCTAAATGAACTATTATATGC 

Sequence ID 314 SEQ ID NO: 16 

CTTTTCCTCCCGCTGTCCCCCACGGAGGGGACTGCTCTCCCCCGCTGCATCCTTTC 
TGTGAGGTACCTTACCCACCTCAGCACCTGAGAGGGTGAAATAGAATTCTAACCTC 
GACATTCGGGAAGTGTTTTTGAGAAGTCTCGGTCGGTAAGGGAAGTCTTCCAAGTC 
CGTGCAGCACTAACGTATTGGCACCTGCCTCCTCTTCGGCCACCCCCCAGATGAGG 
CAGCTGTGACTGTGTCAAGGGAAGCCACGACTCTGACCATAGTCTTCTCTCAGCTT 
CCACTGCCGTCTCCACAGGAAACCCAGAAGTTCTGTGAACAAGTCCATGCTGCCAT
CAAGGCATTTATTGCAGTGTACTATTTGCTTCCAAAGGATCAGGCCCTGAGAACAA 
TGACCTTATTTCCTACAACAGTGTCTGGGTTGCGTGCCAGCAGATGCCTCAGATAC 
CAAGAGATAACAAAGCTGCAGCTCTTTTGATGCTGACCAAGAATGTGGATTTTGTG 
AAGGATGCACATGAAGAAATGGAGCAGGCTGTGGAAGAATGTGACCCTTACTCTGG 
CCTCTTGAATGATACTGAGGAGAACAACTCTGACAACCACAATCATGAGG



Sequence ID 315 SEQ ID NO: 17 

TGGTACAGATACAAACTGGACTCTCAGGACAAAACGACACCAGCCAAACCAGCAGC
CCCTCAGCATCCAGCAGCATGAGCGGAGGCATTTTCCTTTTCTTCGTGGCCAATGC 
CATAATCCACCTCTTCTGCTTCAGTTGAGGTGACACGTCTCAGCCTTAGCCCTGTG 
CCCCCTGAAACAGCTGCCACCATCACTCGCAAGAGAATCCCCTCCATCTTTGGGAG 
GGGTTGATGCCAGACATCACCAGGTTGTAGAAGTTGACAGGCAGTGCCATGGGGGC 
AACAGCCAAAATAGGGGGGTAATGATGTACGGGCCAAGCACTGCCCAGCTGGGGGT 
CAATAAAGTTACCCTTGTACTTG 

Sequence ID 316 SEQ ID NO: 18 

CGCCACTTATCCAGTGAACCACTATCACGAAAAAAACTCTACCTCTCTATACTAAT 
CTCCCTACAAATCTCCTTAATTATAACATTCACAGCCACAGAACTAATCATATTAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

Sequence ID 321 SEQ ID NO: 19 

CAGAACAGTACTTTTTAATTTGACCCATGAATTCTATTTAAATTTGTCACTTAATA 
TTTAGCCAAGAAGCAAACCATCTAAAAAGATTTCTGGTTTATTTCTCCAACTCCTA 
ATAAATAGGGTCACATATTTTTTAACTTTTTTCTAATTTGAAAAGTAATACAGGCA 
TATGGTATTTTAAAAATGAAACAACACAAAGGGATATGTTTTGAAAAGTGGTTCTT 
GCCATCCCTGAACTGTAATCATCCCTAACATATTCATACCTGTTTTCATTTTAAAA
GTTGGGTCAGTTTTTTTATTAGTACATGTATTTCTATCCTACTGATTTATTTGCTA 
TATCATCTAATTTAGTTTGAATATTCCATAATTTACTTAATTAGTCCTGTATGGAG 
ACCTAGCTCTTCTCAGTGTCTACTATTATAAACAATGCTACAGTGAATATTGGTGT 
ATAAATCCATACACACCACGTAACATATCTTAAGTTCCTGGAAGAGATATTGCTAA 
ACCAGAAGATAACCTGCATTTAAAATTTTGACTGCTAGGGTCAGGGTCACATTTAA 
ATTAAATTAGAACAAGGAATGCATAATGTCTTCGATAGCAATCTATTCCAGGTGCA 
CCGTGGTCACAAAGGAAAGCAAAACTGTCAATAACTTTCTTCTCA

Sequence ID 322 SEQ ID NO: 20 

TAGCATTTGGCCTTTTAAAACATTTGTTTATTTTTTTTCTGAGAATGGCTAACACA
CTTTATTGAGGTTCGAAATTAATAAAGAAAATAAAAGAAATGTATCTTCATTCATT 
CTGTATGTTAGTGTTTTAATTACCCTTAGAATATATGGATAAAAAATACTATTCTT 
TGTCTTGGAGAAGGTAAGAGTCTAGTTAGATGAATAAGGGTTATCTATGTAGAACA 
ACTAGAGAATGAGAAGAGAGCTTATGAGATTGAGTACTACGTTATGCAGTAGAGTA 
GCACGTCATCTGCTACTGAGTATGGTGTGATAACATTGTGTAACAGGAAAGTATGA 
TCAATATCTACTTAAAATTAAGGACAATATTAGCACTACATTGCTTTATTTTAAAG
TAAAAATTAGAGAACTAAACACAAGCATTGTAAGTACAATAAAAGCTGATCTTTCT 
AGTTAAGCAGAATAATACATGTTCAAGCATCTGCTAAATCATTAAATATAAGAATA
TAGGGGTTTTCTATAATCTTATTTTCTTTGGAAGAGTACCTCATTTTCAAGANGAG
AAGTTTCTAATTGCCACTTCTTTAAAAATAAAACAGGGTTTTAATGTTCCCAGCAC
AAAAATTAATATCTCTTCAAAAAGTCTCTTGTGATTAAGTTTGAATCCCTTGTCAT
ACTGCTTCTAATATTGACACTGACCTCCTTAGGTATTTTTCAGGGGTTATAATCTT
TTCTTAAGGTATCTTTTTTCAAGAATTGGATACCTTGGGCTT

Sequence ID 323 SEQ ID NO: 21 

CGCGTCGACTTTTAAAGTCATCTCTATAGGAAGGTGCTGGGCAGGGATCCCAGAGA 
AAGAAAGGGTCCAAGACTCCATTAACTGCCCTGGATGAAGGGCACTGCTACAGCAG 
CTAGTACCAGAGACTCTCCTATCTCACGGTTGAGGCAGACCCAGGATAGAATAGAG 
AATAAAAGGAATGCTTATAGGAAACAATTTTGTATGGAATGCTAGATGGCCAAGCC
TCAGCCTTTGGTCCAGTGCAACCCTTGCCTCGCTTGTCAACAGTGAAAAATTAGTT 
TGGTTAGAAGAACCATCTGGAAACACACCAGCTTCTGCTACCTTCATGCTCATTGT 
TAAAAAAAGATTAACCAGTGTGAACATTCTGATCTGTTAATTCCAGGGACTGTTTT 
CTTTCCAATGGACTGTTTGTTGGTAGAATAACCCCCAAAAGCTCAAAGCTAAAATG 
CATCATCAGTCCTAGTCGGCAGTTCCTTAAGAATGGACTGGCGGCGTGGTTGAGCT 
GATATGGAAAAGCTGCACCTTCCTGCAGAAGATCAACTGACCTGCTATCCCACCCC 
AAATTCAACCTGAGGTATATTTCAGTGAAGCAGGTAGCTGTGCTTCTCAAAGCAGA 
GAAGCAGTTTTAAGAACCAAAAAGGTAGAGGAAATCTA

Sequence ID 324 SEQ ID NO: 22 

GTTTGTTACAGGCAGAATTGGATAGATACAGCCCTACAAATGTATATGCCCTCCCC 
TGAAAAAAATTGGATGAAAATCTGCACAGCAAAGTGAAACACACAGATAATAGGAA
CAAAATGTAGTTCCCATGTGCCAAACAAAATAAATGAAATCTCTGCATGTTTGCAG 
CATATCTGCCTTTTGGGAATGTAATCAAGGNATAATCTTTGGCTAGTGTTATGTGC 
CTGTATTTTTTTAAAATGGTACACCAGAAAAGGACTGGCAGTCTACTTCTACCATA 
GTTAAACTTCACCCTCTTTAATTTCACAACATATTCTTTGGAAGCAGGAAGAAATG
CTCATAAAGAGGATCAGACCTTCTTTCCCGTGAAACCAGTATTTGGCGCCATATAT 
AAGCCTGGTTAAATTGGTCATCTAAAGCTGTCAAATAAGACATTCTGTGAAAGGTA 
AACATCGAAACTGGTTATAAGTAAAACCATCAAGCCAACAACAGGGTCTTGAGATA 
ACCTTTGAAGCTTATTGTCTGGCCTGCACCAGAAGATGTCTGCATTACTCATTGCT 
AAAAATGTGTACACAGAACTGCACTAGGATTAATTGGTTCAAGAAGAAATTTAAAC
TTACGTTTGGGTTTCCATACAGCACTCTATTGAATACATGCATCTGAATTTAAGTT 
GCAA 



Sequence ID 325 SEQ ID NO: 23 

GACCAGTAATGGCTTTTAAGAGTCCATTTTGTCATTGTCTCCCTAGTTAATTACAG
GTGGGGGATCTTTTGCCTCTATTCTCTTCATATTGAAATGAATCATACTCATGTTT 
TGTGGAACTCCTTAAAGTTGTAGCTGTCATGATCAGATTTTTTTTATATTTCCTCA
GCTTAACTCTGCTACTTGATTTACAGTGACCCATAACCTACTCATCCTTGGTTTAT 
AGTGACACATAATCTTATCTCTTTATAGAACCTTAAATTTTATCATTATTTTCGCT 
TAGAATACAGCATTTCTTTGCTTCTGTTGCTGGTTTGACTTAAGAAATAAGGCAGT 
AACTCTGATCAATCAATTATCCATAAGGAAGGGCTTTTCATGGGTTCTATTAATTT 
GTTAGTACCCTAAGTATATCTGAAAAATATGTCTATTGAGAGAAGATTTTGGCATT 
CCAGATGGTATAGTCTATATATATTTAAAGTTTTGAATTTGCTTATATATACTCAG 
CTTTCTTTTTCTAGCATTTTTGCATTTACCTGTTAATTGAAGTATACCCCCCACAT 
ATAAAAGTTCCTCTTAAAGACACTGGACTCTTTCTGGGGGGCTAAAATA 

Sequence ID 326 SEQ ID NO: 24 nt: 554

CCCGGAATCGCGGCCCGCGTCGACAACAAACCTGCATGTTCTGCACATGTATCCAG 
GAACTTAAAAAAAAAAAAAGATAGTTTGTGTGTCTTAATTGAATAATAGTAGATTT
ATAGATTAAAGATCTATGGGTTTTTAATATGGATTANAAATCTGTGGGTTTTTGAT 
ATGGATTANAAATCTGTGGGTTTTTAATATGGATTGGAAATCTGTGGGTTTTTAAT 
ATGGATTAAAAAACATCTGTGGGTTTTTAATATGGATTAAACATCTGTGGGTTTTT 
AATATGGATTAAACATCTGGGTTTTTAATATGGATTAAACATCTGTGGGTTTTTAA 
TATGGGTTAAAAATCAAAAGAAAATGAACTATTTGCTCCAGTGCAGGAAAATACAG 
GCAATACTGGATACAATTAGATGGTCAGGAGCGATAACCCGGTTGCCATTGTTTGA 
AGAAGAGAATAAGGNGCTAGCATTCCTATCCGTAGATAATTTGACAGCTAGGAAAT 
AGGGGGAGTCTTCTATGTAGTTAGTGAAGGCTAAATGAACTATTATATGC 

Sequence ID 327 SEQ ID NO: 25 

CGGCTACCGACAGAAGGACTATTTCATCGCCACCCAGGGGCCACTGGCACACACGG 
TTGAGGACTTCTGGAGGATGATCTGGGAGGGGAAGTCCCACACTATCGTGATGCTG 
ACGGAGGTGCAGGAGAGAGAGCAGGATAAATGCTACCAGTATTGGCCAACCGAGGG 
CTCAGTTACTCATGGAGAAATAACGATTGAGATAAAGAATGATACCCTTTCAGAAG
CCATCAGTATACGAGACTTTCTGGTCACTCTCAATCAGCCCCAGGCCCGCCAGGAG 
GAGCAGGTCCGAGTAGTGCGCCAGTTTCACTTCCACGGCTGGCCTGAGATCGGGAT 
TCCCGCCGAGGGCAAAGGCATGATTGACCTCATCGCAGCCGTGCAGAAGCANCAGC 
AGCAGACAGGCAACCACCCCATCACCGTGCACTGCAGTGCCGGAGCTGGGCGAACA 
GGTACATTCATAGCCCTCAGCAACATTTTGGAGCGAGTAAAAGCCGAGGGACTTTT 
ANATGTATTTCAAGCTGTGAAGAGTTTACGACTTCAGAGACCACATATGGTGCAAC
CCTGGAACAGTATGAAATGTGCTACAAAGTGGTACAAGATTTATTGATATATTTCT 
GATTATGCTAATTTCAATGAAGATCCTGCCTTAAATATTTTTTAATTTAATGGCAN
AT 

Sequence ID 328 SEQ ID NO: 26 

CAAGACTCCATCTCAAAAAAAAAAAAAAATCTACAGTGCTGAGTATATAAAATTAT
TAACACATTTCACAACAATATGTGTTTGTGGAGTTAAATATTTTTTGTCTTTAAAA 
CAGGTAATTTTAGTGCATACTTAATTTGATGATTAAATATGGTAGAATTAAGCATT 
TTAAATGTTAATGTTTGTTACATTGTTCAAGAAATAAGTAGAAATATATTCCTTTG 
TTTTTTATTTAAATTTTTGTTCCTCTGTAAACTAAAAGAACACGAAGTAATTGGTC 
ACAATTACTGGTGTTTAACTGCCAAATATGGGTAAATAAGGGAAAATTTTGTTTAA 
TATTTAGTCCTTCTGAGATGGCTTGAATATTTGAATTTTGTTGTACGTCTATACTG 
GGTAGTCACAAGTCTTATAAACACTTTAGAGGAAAGATGGATTTCAGTCTGTATTT
TTAAACATCATTTATTTTAAATCTGGTGCTGAAAAATAAGAAAAAAATTAAACTGC
ATTCTGCTGTTCTTCTTTANAAGCATTCCTGCGTAAATACTGCTGTAATACTGTCA 
TGCAAAGTGTATCCTTTCTTGTCGTATCCTTTTTGGGGCAGTGGTTTTT 

Sequence ID 330 SEQ ID NO: 27 

GCGGGAATCGCGGCCCGCGTCGACCTCAAAGGAGAAAAAAAACCTTGTAAAAAAAG
CAAAAATGACAACAGAAAAACAATCTTATTCCGAGCATTCCAGTAACTTTTTTGTG
TATGTACTTAGCTGTACTATAAGTAGTTGGTTTGTATGAGATGGTTAAAAAGGCCA 
AAGATAAAAGGTTTCTTTTTTTTTCCTTTTTTGTCTATGAAGTTGCTGTTTATTTT
TTTTGGCCTGTTTGATGTATGTGTGAAACAATGTTGTCCAACAATAAACAGGAATT 
TTATTTTGCTGAGTTGTTCTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAATTTTAAAATTTTTAAAATAAAACCCTTGGTTAT

Sequence ID 331 SEQ ID NO: 28 

GCCGCGTCGACCTGCATGAGCCACAGTTTCTTGACTGGAGGCCATCAACCCTCTTG 
GTTGAGGCCTTGTTCTGAGCCCTGACATGTGCTTGGGCACTGGTGGGCCTGGGCTT 
CTGAGGTGGCCTCCTGCCCTGATCAGGGACCCTCCCCGCTTTCCTGGGCCTCTCAG 
TTGAACAAAGCAGCAAAACAAAGGCAGTTTTATATGAAAGATTANAAGCCTGGAAT 
AATCAGGCTTTTTAAATGATGTAATTCCCACTGTAATAGCATAGGGATTTTGGAAG 
CAGCTGCTGGTGGCTTGGGACATCANTGGGGCCAAGGGTTCTCTGTCCCTGGTTCA 
ACTGTGATTTGGCTTTCCCGTGTCTTTCCTGGTGATGCCTTGTTTGGGGTTCTGTG 
GGTTTGGGTGGGAAGAGGGCCATCTGCCTGAATGTAACCTGCTAGCTCTCCGAAGC 
CCTGCGGGCCTGGCTTGTGTGAGCGTGTGGACAGTGGTGGCCGCGCTGTGCCTGCT 
CGTGTTGCCTACATGTCCCTGGCTTGTTGAGGCGCTGCTTCAACCTGCACCCCTCC 
TTGTCTCATAGATGCTCCTTTTGACCTTTTCAAAATTAATATGGATGGGAAAGCTC
CTATGCCTTTTGGCTTCCTGGTAGAAGGCGGGATGCCCAAGGGTCTGCCTGGGTGT 
GGATTGGATGCTTGGGGTGTGGGGGTTGGAAACTGTCTTGTGGCCCACTTGGGCCC 
C 

Sequence ID 335 SEQ ID NO: 29

CCCGCGTCGACTTTTAAAGTCATCTCTATAGGAAGGTGCTGGGCAGGGATCCCAGA 
GAAAGAAAGGGTCCAAGACTCCATTAACTGCCCTGGATGAAGGGCACTGCTACAGC 
AGCTAGTACCAGAGACTCTCCTATCTCACGGTTGAGGCAGACCCAGGATAGAATAG 
AGAATAAAAGGAATGCTTATAGGAAACAATTTTGTATGGAATGCTAGATGGCCAAG 
CCTCAGCCTTTGGTCCAGTGCAACCCTTGCCTCGCTTGTCAACAGTGAAAAATTAG 
TTTGGTTAGAAGAACCATCTGGAAACACACCAGCTTCTGCTACCTTCATGCTCATT
GTTAAAAAAAGATTAACCAGTGTGAACATTCTGATCTGTTAATTCCAGGGACTGTT
TTCTTTCCAATGGACTGTTTGTTGGTAGAATAACCCCCAAAAGCTCAAAGCTAAAA 
TGCATCATCAGTCCTAGTCGGCAGTTCCTTAAGAATGGACTGGCGGCGTGGGTGAG 
CTGATTTGGAAAACTGCCCTTCTGCAAAAAACACTGGCCTGCTTTCCA 

Sequence ID 337 SEQ ID NO: 30 

CAAGACTCCATCTCAAAAAAAAAAAAAAATCTACAGTGCTGAGTATATAAAATTAT
TAACACATTTCACAACAATATGTGTTTGTGGAGTTAAATATTTTTTGTCTTTAAAA
CAGGTAATTTTAGTGCATACTTAATTTGATGATTAAATATGGTAGAATTAAGCATT
TTAAATGTTAATGTTTGTTACATTGTTCAAGAAATAAGTAGAAATATATTCCTTTG 
TTTTTTATTTAAATTTTTGTTCCTCTGTAAACTAAAAGAACACGAAGTAATTGGTC 
ACAATTACTGGTGTTTAACTGCCAAATATGGGTAAATAAGGGAAAATTTTGTTTAA 
TATTTAGTCCTTCTGAGATGGCTTGAATATTTGAATTTTGTTGTACGTCTATACTG 
GGTAGTCACAAGTCTTATAAACACTTTAGAGGAAAGATGGATTTCAGTCTGTATTT 
TTAAACATCATTTATTTTAAATCTGGTGCTGAAAAATAAGAAAAAAATTAAACTGC
ATTCTGCTGTTCTTCTTTAGAAGCATTCCTGCGTAAATACTGCTGTAATACTGTCA 
TGCAAAGTGTATCCTTTCTTGTCGTATCCTTTTTGGGGCAGTGGTT 

Sequence ID 338 SEQ ID NO: 31 

CTGGACTGCATGACCAGATCTGATGGGTGAGACTCAGGTGGCATGGAAGAGCCGAA 
AGAGGATACCATATGTGGGTGCCGGGGGGGATAGGTGAGAAGTACTAGAAGGCGGA 
ATGGAAGGACACTTCTGCTCAGCTCTGTGACACGGGCAGGGACCCTGCAGGGCTCA 
GGTCCTTTAACACAGCAGCTTCATTCTAACACCAGCAGCGTTGGAACACACGTACA 
AGTATGCAGACTAAGCTCTTGCTTGGCTGATACGGCTTTTTGGGTTTTTAGAGAAC 
ATGCATATATGTTCTCATTCATGGTACATGAACTCAGAAGCCTTACTGCCTATTTT
TGTTAATACTTCTGGGCAAACATTACCACTTACAACTCACACCAGTTAGAAATCAT 
TTGTAAAATGTTATTTAATAAAGCCAAAGAACTAAATCATATTTATTTTCCAAGGN
TTTCTAAGATCTCTGAAACTAATGAGGTTTTTTAAATCCCCATTAAGTACTCATCA 
CTGCTAGTAAAAGCAGTTGTCTTTACCTTTAATTCCAGTGAGTCCCCTTAAATTTA 
TTTTTTATTATCTTTGGCTACATTGCCTTAGACAAAATGTGGTCACCCTAATTTAA
NGGATAAAATTCACATCCTCACAGATTTCTTATTAAGAGGGTCTAANCCTTGAATA 
ATCANCAGTGGAAATGGAAGTCTTCTTTACTGGNTTTNATCCTTTCCCTTTTTTAT 
CCCATG 

Sequence ID 339 SEQ ID NO: 32 

TTTTTTTTTAAATAAAGCTGTCGGCACTCAAGGGTAATTTCATATCAGTGTGNTCT 
ACAAGCTGGGGGAAAATGAGTTCTAATTGTCANAGCTACCAAATCCTTCACCTTTA 
GCATAAAGGTTTAAAGATATCACAAAGATGCCAAGTGATTAATAATGTTTTAAACC 
ACCCCTTTTTCTGTCTGAAAAAACAACTAAAACAATATTACAACAGTATAGTTACA 
GAAGGGTTCTATTTTCATATGTTTTATGCACACTGTGCCTCAAAGGTACTATTTAA 
ATATATATACTTTTGAGGGGGTGGCTAATGCAGAAACACCCAAGACCTAAGGAAGA 
TACAACCCCATTTCTAGGTGTGAGGTCTAAATGCTTCACACACCCACTTGTGACCT 
TTTTTCATGAAGAATCATAACACTGTGCAGTGAGAAACAGTGGCAAAGCAATACTG
AAAGCATTTTAAATTATTTACTAGGTTAAAAGGGTGAACTGATACTTTAAATACAT
CAAATTTCATCAT 

Sequence ID 360 SEQ ID NO: 33 

GCAAGTGAGAGCCGGACGGGCACTGGGCGACTCTGTGCCTCGCTGAGGAAAAATAA 
CTAAACATGGGCAAAGGAGATCCTAAGAAGCCGAGAGGCAAAATGTCATCATATGC
ATTTTTTGTGCAAACTTGTCGGGAGGAGCATAAGAAGAAGCACCCAGATGCTTCAG 
TCAACTTCTCAGAGTTTTCTAAGAAGTGCTCAGAGAGGTGGAAGACCATGTCTGCT
AAAGAGAAAGGAAAATTTGAAGATATGGCAAAAGCGGACAAGGCCCGTTATGAAAG
AGAAATGAAAACCTATATCCCTCCCAAAGGGGAGACAAAAAAGAAGTTCAAGGATC
CCAATGCACCCAAGAGGCCTCCTTCGGCCTTCTTCCTCTTCTGCTCTGAGTATCGC 
CCAAAAATCAAAGGAGAACATCCTGGCCTGTCCATTGGTGATGTTGCGAAGAAACT 
GGGAGAGATGTGGAATAACACTGCTGCAGATGACAAGCAGCCTTATGAAAAGAAGG 
CTGCGAAGCTGAAGGAAAAATACGAAAAGGTA



Sequence ID 361 SEQ ID NO: 34 nt: 622

CTGTNATNGAATCTGCTTGTNACTNAAATGCTAAACTCAATTCTGTAATTCAATAG 
GTGCACCTNTCTGAGAAACATANNAGACAATGAGGAAAAGGATTCANCATTCCGTG
GAATTTGTACCATGATCAGTGTGAATCCCANTGGCGTAATCCAAGTAAGATGTTCA 
CAAAGATTTGTTTTTAATGTCTAATTAATAAAATTTTAAAGGAAGAAACATTCTAA
TACTTTAATTATAAAAAGTTAACTATTTTCAAAGGTATCAAAATACAGTTAAACCT
TTAAAATGTATATTTCTTAATATCTTGAAATTGTAATGCCTTTTTTTTTTCCTAAA 
TTTTTTTTGTCATGAAATGAGATAGTAACAGCAGATTGGGACAACAAGGTTATATT 
CTTGTCTTGAATCAGGCCATGGCTTCTTTCATCCAAATTTCAGACCTCATTTATTT 
ACTTTGTCCCTGCCTCCCATCCCTGGATATCANGTTTGTGGATATCTACAGTTAAT 
AGAGTGACCAAATAGTAGGAATACTGTCTCTCTATTCTGAATAAAATACTTTGAAT
CAGATTTAGAAATAATGAATAAAATACAAATCACCATTGAAATTGCTCTAATTTTG
AGAGCT 

Sequence ID 363 SEQ ID NO: 35 nt: 628

ATCACNTGAGGCAAGAGTTTGAGCCAGCCTAGCTAACATGGTGAAACCCCATCTCT 
ACAAAAATATAAAAATTAGCCTGGGTGGTGATGGGCACCTGTAACCCCAGCTACTC 
GGGAGGCTGAGGTAGGAGAATCACTTGAACCCGGGAGATGGAGGTTGCAGTGAGCC 
AAGATCGTGCCACTGCACTCCAGCCTGTGTGACAGAACAAGACTCTGTCTCAAAAA 
AAAATAATAATAATAATAATAATAAAAAGGAATAACATAGCTAGGAATAAATTTAA
TCAAAGAGGTGAAAGACTTATACACTTAAAACTACAAAAAAAAAATCACTGAAGGA
ATTATAGACCCAAATAAAAATAAATAAAAAGACATTCTGTGTTTTAGGGAAAGAAG
ACTTAATATTGTTAAGATGTCAATACTACCCAAAGTGATCTACAGATTCAACATAA
TCCCTATCAAAATTCCAACAGCCTACTTTGTAGAAATGGAAAAGCCAATTTTCAAA
TTCAGATGGAATTGCGAGGGGTTCTGAATAACAAAAACAATCTTGGGGAAAAAAAA
CAAAAAACAAAGTCAAAGAACTCACACTTCTCTATTTATAAATTTACTACAAAGTT
ATAGTAATCAAA

Sequence ID 364 SEQ ID NO: 36 nt: 528

TGAACATCCAGCCATGTCATTTCTTCCATTCCTGCCCTGGAGTAAAGTAGATTTAC
TGAGCTGATGACTTGTGTGCATTTGTACATTGCAACCTTAGCTTACCTCTTGAAGC
ATGTAGAGCATTCATCACCCACCATTCATTCACTGCCTACTCCCACCACAGCTGTT
TCGTGGTCTGTCTGCTCCCTGTGCCACCCCCACCCCATCAGGTGGGCCTTTTGCAA
GTGATGAAGTCACCTGTGGGGGAAGAGCTTTCCTTTCCTCTCCTCAACTCAGAAGG
CCTCTTCCTCTTGCTCAAGAGGGTGCTGCTGCTTTCTGCCTCCTTCCCCGGCCGGC
CTCCATCCCAGTTCACCTTTTCAGAAATGGCCCCTCAGTCAACTCTTCCCTTTTCT
CCTGGCTTTTTATTTCTCCCAGTCTCTTAAGAGTATCCTTAGCTTTAAAAACAATA
ACACAGAGGATGGGTGCAGTGGCTCATGCCTGTAATCCCAGCACTTTGGAGCCTGG
GGCGGGCGGATCACTTGAGGNCA

Sequence ID 365 SEQ ID NO: 37

GTCCCGGAATCGCGGCCGCGTCGACCTTTTCTATGCCTGCTATATAAACAGTACCT 
TGCAAGATGTCCTGTCTGATATCCACAAAGGGGTATTGTCAACCCCAAGTTCAGAC 
AGCTTTGTATTCTTCTGTCCCTGGATACATGAATTACTGCCATCTTTACACAGCGC 
CCTAAAATACCAACGCGAAGTTACCTGCTCAGCTTGAAGCTGCGCTGTACCCTGGA 
ACCAGCACTTCTGCTGAATGACTCAGGATGAAGCCTCGACTTCTCCTTCCCATCCC 
ATGCCCAGACCCCAGTGGCTCCTTTCCCAATCTGATCCAGTGACTTTAAGTCCAGC 
TGTTGCAACCTGGGCATGAGGAGGAGTGCAAGATGGCTTTGTCCTACCTGGAAAGA 
GGCTTTCTGGA 

Sequence ID 366 SEQ ID NO: 38 

CACCATTTACACACAGTGGGTCCTTGAATAGCATCGTTTTATTCAATGTCATTTTG 
TTATAACATTGAGAAAAAAATTGATTCCCGGCTGGGGCCACTGTCTGTGCACCGT 

Sequence ID 3-&& SEQ ID NO: 39 nt: 329

GAAAGATCTAAAATCGACACCCTAACATCACAATTAAAAGAACTAGAGAAGCAAGA
GCAAATTCAAAAGCTAGCAGAAGGCAAGAAATAACTAAGATCAGAGCAGAGCTGAA
AGAGATAGAGACACAAAAAACCATTCAAAAAAAAACAATGAATCCAGGAGTTTTTT
TTTTAAAAAGATCAACAGAATTGACAGACTGCTAGCAAGACTAATAAAGAAGAGAG
AAGCATCAAATAGACTCAATAAAAAATGATAAAGGGGATATCACCACCAATCCCAC
AGAAATACAAACTACCATCAGAGAACACTATAAACACCTCTATGCAAAT

Sequence ID 369 SEQ ID NO: 40 

GAAAGATCTAAAATCGACACCCTAACATCACAATTAAAAGAACTAGAGAAGCAAGA
GCAAATTCAAAAGCTAGCAGAAGGCAAGAAATAACTAAGATCAGAGCAGAGCTGAA
AGAGATAGAGACACAAAAAACCATTCAAAAAAAAACAATGAATCCAGGAGTTTTTT
TTTTAAAAAGATCAACAGAATTGACAGACTGCTAGCAAGACTAATAAAGAAGAGAG
AAGCATCAAATAGACTCAATAAAAAATGATAAAGGGGATATCACCACCAATCCCAC
AGAAATACAAACTACCATCAGAGAACACTATAAACACCTCTATGCAAATAAACTAG
AAAAT 

Sequence ID 370 SEQ ID NO: 41 

GAAAGATCTAAAATCGACACCCTAACATCACAATTAAAAGAACTAGAGAAGCAAGA
GCAAATTCAAAAGCTAGCAGAAGGCAAGAAATAACTAAGATCAGAGCAGAGCTGAA
AGAGATAGAGACACAAAAAACCATTCAAAAAAAAACAATGAATCCAGGAGTTTTTT
TTTTAAAAAGATCAACA

Sequence ID 371 SEQ ID NO: 42

GCCCGGAATCGCGGCCGCGTCGACGTAAGCTCGGCTGAATCCACGGTTCAAGAACA
GGAAAGAAGGCCAAGGCATAGGGAGTGGGGCAGTTGGGTGAATATTAGTACCTTTC
CCTCAGNTNCATTAATTACCCCTGCCTACTCTGCACAAAAGGATNTAACAACAGTT
TCCTTTTTAATGGCCAGGTACAGCTGCTTATATGGANGGGCATTTNTNAATGATAT
CCTTNATCACTGTCTTAATCATCACATNCTTAAAACAATCACTTTATTGTGTTAAG
GAAGATAAAAATGGCTGGGTTCAATTTCCGTTCTGGAAGAAATCGANTNAAAAGGT
AACCATTTAATAATGCANAGGGCANTTTCACTGCAGACCCTAATACTGGAAATTTT
TAAAAACAAATGAAAAACTTCTACTTTTTCTTCTAAGCTTACTTAACCACCCAAAT
TTTCCAGCCACATATCTTCCTAGTCTACAACTGCCTTTAACTTTAAGAGATGCTCA
AAAAAATGTAAATTCTCAAATACATTCTTATTACAATTACTGCTAACCT

Sequence ID 373 SEQ ID NO: 43 

CCAGTGTGCTGGGATTACAGGCATGAGCCCTGCACCCAGCCTCTTAAACTGATCAT 
ATGATATTGGTTCTCAACCAAGGGTGACTTTGCCCCCAGAGGATACTTGGCAATGT 
CTGGAGATACTCAGTTGTCATGACTTGGACAGGTGCTACTGTCACCCAGTGGGTAG 
AGGTCAGGGATGGTGCTAAACATAGGACAGCTGTCAAGAGAAAAGAATGTACCCAG 
CCCCAAATGTCAGTAGGGCTGAGGTTGAGAAACCCAGCTGTAGCTGACGTGTGAAG
GACAGACTGGCCTGGAAGTGTGTTTTCTGCCCCTTTCCACCCCTGCATATTAGTTA 
AGGCCAAAGGAAAAAAGGAATGCAGGAAATGCCCGTTAAAAATCTTCAAAACAATA
TAAAATGATCAATTCCACTAAAACCCTTTACACATTTAAGTATAAAGGTATTGGTA 
GGAAAATTTGTTATTCACTGCTTTTCTCAGTGTCATGAAATAATTATTTCTGCTGT 
CAGTTT 

Sequence ID 374 SEQ ID NO: 44 

AAAAAAAAAATCACTGAAGGAATTATAGACCCAAATAAAAATAAATAAAAAGACAT
TCTGTGTTTTAGGGAAAGAAGACTTAATATTGTTAAGATGTCAATACTACCCAAAG
TGATCTACAGATTCAACATAATCCCTATCAAAATTCCAACAGCCTACTTTGTAGAA
ATGGAAAAGCCAATTTTCAAATTCAGATGGAATTGCGAGGGGTTCTGAATAACAAA 
AACAATCTTGGGGAAAAAAAACAAAAAACAAAGTCAAAGAACTCACACTTCTCTAT
TTATAATTTACTACAAAGTTATAGTAATCAAAGTCGACGCGGCCGCGATTCCGGG 

Sequence ID 378 SEQ ID NO: 45 

CGACTGCGGCTCTTCCTCGGGCAGCGGAAGCGGCGCGGCGGTCGGAGAAGTGGCCT 
AAAACTTCGGCGTTGGGTGAAAGAAAATGGCCCGAACCAAGCAGACTGCTCGTAAG 
TCCACCGGTGGGAAAGCCCCCCGCAAACAGCTGGCCACGAAAGCCGCCAGGAAAAG
CGCTCCCTCTACCGGCGGGGTGAAGAAGCCTCATCGCTACAGGCCCGGGACCGTGG 
CGCTTCGAGAGATTCGTCGTTATCAGAAGTCGACCGAGCTGCTCATCCGGAAGCTG 
CCCTTCCAGAGGTTGGTGAGGGANATCGCCCAGG 



Sequence ID 380 SEQ ID NO: 46 

GCAATTTAATTTTTAATAACAAAGATACTGTATTTTAACATGGTGAAATATACTTG 
GCTAAGTCCAGATTAAAAAAAAAAAGTATCTAGCCCAACAGTACAATTATACAGCT 
TTGTACAGAACATTCCATAGATCAACAGAAAATACATTTGAGCGCAAAAATAAAAA 
ATATTTAAGGAGAATCTCTAAGCAGCATTTTATTTCTGCAAAAGACATATCTTGTC
TGATTAAATATCTACAAGTGCTTTTCCTTTCAAAAATACATATATTCTTAATAGAC
TAAGTCATTAACAATGACCTGGTAATTCTTTCACTTCAATTTGAATGATTTATAAG
CTAAATCTTCAACCACAAAAAGGTTTTTATTTGTATTAAGATGTTACCACTTTTGA 
CAAAAAGCTTAAAATATTTTATATTTCAAAGGAAAATTAGCAACATAACTTTACAA
TATATTCTATGATATTTTGATTGTGAGGGCTACTCTATTTAAAACTGATGATCTCT 
GTTGTGTTGCTCAGATGCAGGAAAGCAGCAAAA 

Sequence ID 381 SEQ ID NO: 47 nt: 534

GACTTANATCTAAATGGACCACATTCTCTACTTAAAAAAATGCTATTAACCATGTG 
ATCTTCTCAGTCATGAGGTAATCTGGTGACTACCCTTCCTCAAAGCCAGTTGGGAT
ATTCTTTGAATAGAGTAAAACAGTGTTTCTAGGCTGGGAGACACCAGACATAGTTG 
AGGACAGAGGTGCTAGAAAATAGGAAGTTTAAAAGCATGTGCGGTGATGCTCAGAG 
GAGGTAAACCCCACCCTCATGCTCATAGCTTCCAATCATTTTCTCTAGTTCTTAAC 
TCTTAAATGTGAGAAATGCTTGAAGATTCTAGTCATCTGAAGAAAGTCTCTTTATT 
AAAGATTTTCATAAAAGAGACCAAAGCAGACAAACAGAAAAAGACATCTTGGGGAA 
AAAAACAAGGATAATGGGAAGAGAAGGAAAGTTTTAAAAATTATCAATATCCTCAG 
GGGGACAAAATATTATATCCTATAAAGACAGATTTTTATTTTTTAAAAAAATAGAA
AGCAAAACAAGCTCCTAAAAATAAAGTTTG

Sequence ID 382 SEQ ID NO: 48 nt: 444

GTTAAGGAAGTCAGCACTTACATTAAGAAAATTGGCTACAACCCCGACACAGTAGC 
ATTTGTGCCAATTTCTGGTTGGAATGGTGACAACATGCTGGAGCCAAGTGCTAACA 
TGCCTTGGTTCAAGGGATGGAAAGTCACCCGTAAGGATGGCAATGCCAGTGGAACC 
ACGCTGCTTGAGGCTCTGGACTGCATCCTACCACCAACTCGTCCAACTGACAAGCC 
CTTGCGCCTGCCTCTCCAGGATGTCTACAAAATTGGTGGTATTGGTACTGTTCCTG 
TTGGCCGAGTGGAGACTGGTGTTCTCAAACCCGGTATGGTGGTCACCTTTGCTCCA
GTCAACGTTACAACGGAAGTAAAATCTGTCGAAATGCACCATGAAGCTTTGAGTGA 
AGCTTTTCCTGGGGACAATGTGGGCTTCAATGTCAAGAATGTGTCTGTCAAG 

Sequence ID 383 SEQ ID NO: 49 nt: 566

CTTTGAAGAACTTTGCCAAATACTTTCTTACCAATCTCATGAGGAGAGGGAACATG 
CTGAGAAACTGATGAAGCTGCAGAACCAACGAGGTGGCCGAATCTTCCTTCAGGAT 
ATCAAGAAACCAGACTGTGATGACTGGGAGAGCGGGCTGAATGCAATGGAGTGTGC 
ATTACATTTGGAAAAAAATGTGAATCAGTCACTACTGGAACTGCACAAACTGGCCA 
CTGACAAAAATGACCCCCATTTGTGTGACTTCATTGAGACACATTACCTGAATGAG 
CAGGTGAAAGCCATCAAAGAATTGGGTGACCACGTGACCAACTTGCGCAAGATGGG
AGCGCCCGAATCTGGCTTGGCGGAATATCTCTTTGACAAGCACACCCTGGGAGACA 
GTGATAATGAAAGCTAAGCCTCGGGCTAATTTCCCCATAGCCGTGGGGTGACTTCC
CTGGTCACCAAGGCAGTGCATGCATGTTGGGGTTTCCTTTACCTTTTCTATAAGTT 
GTACCAAAACATCCACTTAAGTTCTTTGATTTGTCCATTCCTTCAAATAAAGAAAT 
TTGGTA 

Sequence ID 384 SEQ ID NO: 50 

TTTTGGGGTTTATATATAAGCCTGGTTCTTGCTGAAACTGCTTATGTTGATAACCA 
GTTAGTGAGTTCCTCTCTATTGACTTGCTGGGAAGTTTATAGAGACATTTTTTATG
CATTCAGAGATTTCAGTACAAATCTTGAAAAAGGGACATTTAGGCCGGGCGCGGTG 
GCTCACATCTGTAACCCTAGCACTCTGGGAGGCTGAGGTGGGTGGATCATGAAGTC 
AAGAGATAGAGACCATCCTGGCAAAAATTAGCTGGGCGTGGTGGGGTGCGCCCGTA 
GTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATTGCTTGAGCCCGGGAGGCGGAG 
GTTTCATTGAGCCGAGATAGTGCCACTGCACTCCAGCCTGGACAACAGAGCGAGAC 
TGTGTCTT 

Sequence ID 386 SEQ ID NO: 51 

CTAAGGGTTTAAAGATGGAAAGAGGCATTGATGAACAGCTGGGGAAGGAGTAGTTT 
GAGGTAGATGTGCAGATGGAATGAAGAGAAGGTCTCAAGAAGAGGGTGGAGCCAAA 
GAGGGCTGCAGATTTAGAAGGCTAAAGTCTTTAGATGGCTTTGGATAGCCTGTTGT 
ATCTTGGACCATGCAGGTTACAGTGGAGCATGGAGTGGGGACAGAAGTGGAGGAAG 
GAACCAGGGAACATGGAGTGAGAAGCTAAAGGAAAGTGATGCAGTAGATACATGGC 
TCTAAAGTACTCAGGACTTTCAGAGGCTTAAACATAGGGTGACCAACTATCCCACT 
ATGCCTGATACTAAGGGCATTCCCTGGATGTGGACCTTTCATTCCCCAAATTAGGA 
AAGTCTTGGGCATACCAAGACAAGTTGGCCACCCTACTCAAAAGTATGTAAGCTAA
CATATCTGTTCTCTAAGAGGTTAAAGCTGGATGGGGATACCAGATGTATGTACGTG
ATGCAGTTAAACAGCAATACAAGGGGGCAAGTCTACCTGATCGGCCAATTCAATGG 
GA 

Sequence ID 387 SEQ ID NO: 52 

GAAGCCAAACCAAAGGAGCTTCTACTTCATGATGCCATTTATGTAAAGTTCAGGCA 
GAGAAAATCAGTGGTTTAAGAAGTTAGAATAATGATTATCTTTGGAGGGATTGCAA 
CTGGAAGAAGTCATGATTGGGATTTCTGGGTCCTAATAGTGCTCTGTGTCTTGATC 
TGAGTGCCGACTACATGAGTGGTTAGGTTTGCAAAATTCATTGAGTTATGCACTTA 
ATGGTGTTGTCTTATTAGAGCTGATGGAGGAGAGAGGGCTTCAATTTGCACAACTG 
AGTAATCAGCTAGGCCCAGTCACTAGGTGAACAACTTACTGCTCCAATCAGCCTTA 
GAGCAGGAATCAAACTCATGTCTCAGAAAAGTTATTAATTCAGCTTGTCTTGGGAC 
TTCCTTCAGAGTCACTCTTGAATAGCTGAAATAGTAAATGTTAAATCTGTGGATGC
AAGTGTGTAAATTATTTTAGTCATCAGCTCTAATAAGATGGCCTTTGGGGAAATGA 
GTATAAGGTCACGAAAATGAAATGGCAAGAAGGAGGTCTACTATTTCTTCTGTAAT 
ACTGATTTTTACCCCATCAGGGTCAGTCCCCAGAGGTTGTAAATGTGAAGCTTG-T 
CTTTTTCTTTAATAA 

Sequence ID 388 SEQ ID NO: 53 

CTTTGGACACTAGGAAAAAACCTTGTAGAGAGAGTAAAAAATTTAACACCCATAGT
AGGCCTAAAAGCAGCCACCAATTAAGAAAGCGTTCAAGCTCAACACCCACTACCTA
AAAAATCCCAAACATATAACTGAACTCCTCACACCCAATTGGACCAATCTATCACC 
CTATAGAAGAACTAATGTTAGTATAAGTAACATGAAAACATTCTCCTCCGCATAAG 
CCTGCGTCAGATTAAAACACTGAACTGACAATTAACAGCCCAATATCTACAATCAA 
CCAACAAGTCATTATTACCCTCACTGTCAACCCAACACAGGCATGCTCATAAGGAA 
AGGTTAAAAAAAGTAAAAGGAACTCGGCAAATCTTACCCCGCCTGTTTACCAAAAA
CATCACCTCTAGCATCACCAGTATTAGAGGCACCGCCTGCCCAGTGACACATGTTT 
AACGGCCGCGGTACCCTAACCGTGCAAAGGTAGCATAATCACTTGTTCCTTAATTA 
GGGACCTGTATGAATGGCTCCACGAGGGTTCAGCTGTCTCTTACTTTTAACCAGTG 
AAATTGACCTGCCCGTGAAGAGGCGGGCATAACACAGCAAGACGAGAAGACCCTAT 
GGAGCTTTAATTTATTAATGCAAACAGTCCTAACAAACCCCAGGTCCTAAACTCCA 
AACCTGCATTAAA 

Sequence ID 389 SEQ ID NO: 54 

CGACCCGGAATTCGCGGCCGCGTCGACTGAGTTCTTGACAAGAGTGTTTTTCCCTT 
CCCGTCACAGAGTGGGCCCAACGACCTACGGCACTTTGACCCCGAGTTTACCGAAG 
AGCCTGTCCCCAACTCCATTGGCAAGTCCCCTGACAGCGTCCTCGTCACAGCCAGC
GTCAAGGAAGCTGCCGAGGCTTTCCTAGGCTTTTCCTATGCGCCTCCCACGGACTC 
TTTCCTCTGAACCCTGTTAGGGCTTGGTTTTAAAGGATTTTATGTGTGTTTCCGAA 
TGTTTTAGTTAGCCTTTTGGTGGAGCCGCCAGCTGACAGGACATCTTACAAGAGAA
TTTGCACATCTCTGGAAGCTTAGCAATCTTATTGCACACTGTTCGCTGGAAGCTTT 
TTGAAGAGCACATTCTCCTCAGTGAGCTCATGAGGTTTTCATTTTTATTCTTCCTT 
CCAACGTGGTGCTATCTCTGAAACGAGCGTTAGAGTGCCGCCTTAGACGGAGGCAG 
GAGTTTCGTTAGAAAGCGGACGCTGTTCT 

Sequence ID 390 SEQ ID NO: 55 nt : 523

GAATCCCTAGAAAAAGAGAATTCCCAACTTGATGAGGAAAACTTAGAACTGCGAAG
GAATGTAGAATCTTTGAAGTGTGCAAGCATGAAAATGGCTCAGCTACAGCTAGAAA
ACAAAGAACTGGAAAGTGAAAAAGAGCAACTTAAGAAGGGTTTGGAGCTCCTGAAA
GCATCTTTCAAGAAAACAGAACGCTTAGAAGTTAGCTACCAGGGTTTAGATATAGA
AAATCAAAGACTGCAAAAAACTTTAGAGAACAGCAATAAAAAAATCCAGCAATTAG
AGAGTGAACTACAAGACTTAGAGATGGAAAATCAAACATTGCAGAAAAACCTAGAA
GAACTAAAAATATCTAGCAAAAGACTAGAACAGCTGGAAAAAGAAAATAAATCATT
AGAGCAAGAGACTTCTCAACTGGAAAAGGATAAGAAACAATTGGAGAAGGAAAATA
AGAGACTCCGACANCAAGCAGAAATTAAAGATCCACATTTGAAGAAAATAATGTGA
AGATTGGAAATTTGGAAAA

Sequence ID 3-»jr SEQ ID NO: 56 nt : 566

CTTTGAAGAACTTTGCCAAATACTTTCTTACCAATCTCATGAGGAGAGGGAACATG
CTGAGAAACTGATGAAGCTGCAGAACCAACGAGGTGGCCGAATCTTCCTTCAGGAT 
ATCAAGAAACCAGACTGTGATGACTGGGAGAGCGGGCTGAATGCAATGGAGTGTGC 
ATTACATTTGGAAAAAAATGTGAATCAGTCACTACTGGAACTGCACAAACTGGCCA 
CTGACAAAAATGACCCCCATTTGTGTGACTTCATTGAGACACATTACCTGAATGAG
CAGGTGAAAGCCATCAAAGAATTGGGTGACCACGTGACCAACTTGCGCAAGATGGG
AGCGCCCGAATCTGGCTTGGCGGAATATCTCTTTGACAAGCACACCCTGGGAGACA
GTGATAATGAAAGCTAAGCCTCGGGCTAATTTCCCCATAGCCGTGGGGTGACTTCC
CTGGTCACCAAGGCAGTGCATGCATGTTGGGGTTTCCTTTACCTTTTCTATAAGTT 
GTACCAAAACATCCACTTAAGTTCTTTGATTTGTCCATTCCTTCAAATAAAGAAAT 
TTGGTA 



Sequence ID 394 SEQ ID NO: 57 

GACCCGGAATCGCGGCCGCGTCGACCATTTTAGCCAAGGTGCCTCTATAGGGGTCA 
AGACATCATGTGCCCAGACCTAAGGTCAGGAATGTCATATTTTTCTGTTAAAATCA
TTTTATTTCTGTGTATCTTACCTTTAAATCATTGTGGTTTACTCTGAGATTCTGTA 
GTCCTAATATTGTATCATTGTGCTGTCTGCAAAACAACTTGAATCTATTTTGTTTG 
CATCTTTTGTTACATGTAACGCAGCTGTACTTTATGTTCTTTGCAACTGTTTCCAT 
TATGAGAACGCTGTGCTATTTACAAGGTTACATTTTTCTTGGCCAGGCGAGGTGGT 
CATGCCTGTGATCCCAGCACTTTGGGAGGCCAAGGTGGGCGGATCACTTGAGGTAA 
AGAGTTGAGACCAGCCTGGCTAGCATGGCGAAGCCCAGTCTCTACTAAAAATACAA 
AAATTGGCCGGGTGAAATTAGCCGGGCGTGGTGGTGTGTGCTTGTAATCCCAGCTA 
CTCGGGAGGCTGAGGCAGGAGAATCGCTTGAATCCGGGAGGCAGAGGTTGCAGTGA 
GCCAAGATCANGCCACTGCACTCCACCTCGGGGTCAAGAGCGAAACTCTGTCTCAA 

Sequence ID 395 SEQ ID NO: 58 

CCGTTTTAGTCAGGATGGTCTCGATCTCCTGACCTCGTGATCCGCCTGCCTCGGCC 
TCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCGCGCCCGGCGTAAATCAGGTTT 
TTTAAATGTTTGCCAAACCTTATCACTGACTTTTATAACAAAATTATTTACTATAA 
TCATTAGGGAATATTTAAGTTCTGCTAATACTTAAAATTGCAGAGTGCTAAAACCA 
GCAGTGAGTTTAGAATCAAGCTAAGCTTTATTGTTGCTACTATTTGAGGCATATTA 
GTTGACTGGTGTTCATATGCAAGGCAGTCTACTGGGTGCAACAAGGGTTAGAAGGA 
TATTTTTAAAAAACTGACCCTATTCTCAGGATGAAAATAATACACTAGTAATAGTC 
TGCTCTGTTGGTTAACTCCTCGTAAGGAGGTCAATTAAAATGCTGTAGTGTTGCAA 
GGGAAGGAGAGGAAGAATCATATTCCTTCACTAGCAGGATCAAGAAAGCTTTTATA
GAAATATACAAAATCTTCACTTCTTGAAGGATTGGTAAAATTTAATAGCCAACATT 
GGGCACTTATTCATTCTCTGAGTAAATATTTATTGCAT 

Sequence ID 396 SEQ ID NO: 59 

CTTAAATCTAAATGGACCACATTCTCTACTTAAAAAAATGCTATTAACCATGTGAT 
CTTCTCAGTCATGAGGTAATCTGGTGACTACCCTTCCTCAAAGCCAGTTGGGATAT 
TCTTTGAATAGAGTAAAACAGTGTTTCTAGGCTGGGAGACACCAGACATAGTTGAG
GACAGAGGTGCTAGAAAATAGGAAGTTTAAAAGCATGTGCGGTGATGCTCAGAGGA
GGTAAACCCCACCCTCATGCTCATAGCTTCCAATCATTTTCTCTAGTTCTTAACTC 
TTAAATGTGAGAAATGCTTGAAGATTACTAGTCATCTGAAGAAAGTCTCTTTATTA 
AAGATTTTCATAAAAGAGACCAAAGCAGACAAACAGAAAAAGACATCTTGGGGAAA 
AAAACAAGGATAATGGGAAGAGAAGGAAAGTTTTAAAAATTATCAATATCCTCAGG
GGGACAAAATATTATATCCTATAAAGACAGATTTTTATTTTTTAAAAAAATAGAAA
GCAAAACAAGCTCCTAAAAA

Sequence ID 397 SEQ ID NO: 60 nt : 534

GACCCGGAATCGCGGCCGCGTCGACGGAAGCTCCTGCCCCTCCTAAAGCTGAAGCC 
AAAGCGAAGGCTTTAAAGGCCAAGAAGGCAGTGTTGAAAGGTGTCCACAGCCACAA
AAAGAAGGAGATCCGCACGTCACCCACCTTCCGGCGGCCGAAGACACTGCGACTCC
GGAGACAGCCCAAATATCCTCGGAAGAGCGCTCCCAGGAGAAACAAGCTTGACCAC
TATGCTATCATCAAGTTTCCGCTGACCACTGAGTCTGCCATGAAGAAGATAGAAGA 
CAACAACACACTTGTGTTCATTGTGGATGTTAAAGCCAACAAGCACCAGATTAAAC 
AGGCTGTGAAGAAGCTGTATGACATTGATGTGGCCAAGGTCAACACCCTGATTCGG 
CCTGATGGAGAGAAGAAGGCATATGTTCGACTGGCTCCTGATTACGATGCTTTGGA 
TGTTGCCAACAAAATTGGGATCATTTAAACTGAGTCCAGCTGCCTAATTCTGAATA 
TATATATATATATATATCTTTTCACCATAA 

Sequence ID 398 SEQ ID NO: 61 nt : 512

GGGGAGCCCCCTCTTCCCTCAGTTGTTCCTACTCAGACTGTTGCACTCTAAACCTA 
GGGAGGTTGAAGAATGAGACCCTTAGGTTTTAACACGAATCCTGACACCACCATCT 
ATAGGGTCCCAACTTGGTTATTGTAGGCAACCTTCCCTCTCTCCTTGGTGAAGAAC 
ATCCCAAGCCAGAAAGAAGTTAACTACAGTGTTTTCCTTTGCACCGATCCCCACCC 
CAATTCAATCCCGGAAGGGACTTACTTAGGAAACCCTTCTTTACTAGATATCCTGG 
CCCCCTGGGCTTGTGAACACCTCCTAGCCACATCACTACAGTACAGTGAGTGACCC 
CAGCCTCCTGCCTACCCCAAGATGCCCCTCCCCACCCTGACCGTGCTAACTGTGTG 
TACATATATATTCTACATATATGTATATTAAAACTGCACTGCCATGTCTGCCCTTT 
TTTGTGGTGTCTAGCATTAACTTATTGTCTAGGCCAAAGCGGGGGTGGGAGGGGAA 
TGCCACAG 

Sequence ID 399 SEQ ID NO: 62 

TTTTGGCATTACTTAATCCAATTATAAAAACTGAATTTTTAAAAAACAGCACTTGT 
TTTTTCTTCCAAGATTAATTTGAATTTTTTTATGGACATTAGAAAACATTGCAGTT
TAGTCATAATCAAAAATAAATCTTGAGGCTGGTAGAGCAGCTTTGTTGCTGTTTAT
ATTTTTATTGCTTACTGGATTTCAGTGTTACCTAGTGCCATCAGTTTGGTATTTTG 
CCACCTTGCACATTCAGTGATGTTTGATTTTTCTTTTTCCTTTTTTTCATATTACT
TTTAAATCCTGAATAGTTTGTGGCAGCTGGAGATCACCTAGTCCACCACTGTCCAA 
CATGGCAATGGTAAGTAATATTGAGTAAAGAATAGAAAATTAGTAAAATGCATGGC 
TTCAGAATTATAGCAATTTGCAAAATAGGTTAATGGATGAAAATTAGAATGACCAG 
TTTAACTTTCCCCCCAGCAGATTCTTCTGTTAAACAATGCCCCTTCAAAATAAAGG 
AAGAACAAGTGGGTGTTATACCTATGTTATTTGGCTATGTTAGCACAATATGATGG 
ACTAATTTGAGAAAAAGCATTTACTTCCTTTACTATTACTTCTTTTCTTTATAGGG

Sequence ID 400 SEQ ID NO: 63

GAAGAAGCGCGAAGAGCCGTTAGTCATGCCGGTGTGGTGGCGGCGGCGGAGACTGC 
GGGCCCGTAGCTGGGCTCTGCGAGGTGCAAGAAAGCCTTTGAGGTGAAGGTGTATG 
AAAGTCATCATAACAGATGTTTTCCAAAAACTTGTAGAAGGTTGTGAAAAAACTAC 
TAGGATCACGCGGCATGTATTGAGCATATAGGTTGCTGTAGATGAATGTTCTTAGC 
TGTCATGTTTAAAAATACTTCTGCTTCGTTACCTCAAGTGTGGCATGCAGCATTTT 
GGAAGGAAAATTGAAGACGTGTTCAAGAAAACATGAACAGAAGCAAATGATGAAAA
TGAGCATTTTACTTGATGTTGATAACATCACAATAAATTATGGAGAAAAATACATA
TTTGGCTAACTTTTAATTGCTGAACAATAAAGTGTTTTCTTTTAAATCNAAAAA

Sequence ID 401 SEQ ID NO: 64 

GAAGCCAAACCAAAGGGAGCTTCTACTTCATGATGCCATTTATGTAAAGTTCAGGC 
AGAGAAAATCAGTGGTTTAAGAAGTTAGAATAATGATTATCTTTGGAGGGATTGCA 
ACTGGAAGAAGTCATGATTGGGATTTCTGGGTCCTAATAGTGCTCTGTGTCTTGAT 
CTGAGTGCCGACTACATGAGTGGTTAGGTTTGCAAAATTCATTGAGTTATGCACTT 
AATGGTGTTGTCTTATTAGAGCTGATGGAGGAGAGAGGGCTTCAATTTGCACAACT 
GAGTAATCAGCTAGGCCCAGTCACTAGGTGAACAACTTACTGCTACCAATCAGCCT 
TAGAGCAGGAATCAAACTCATGTCTCAGAAAAGTTATTAATTCAGCTTGTCTTGGG 
ACTTCCTTCAGAGTCACTCTTGAATAGCTGAAATAGTAAATGTTAAATCTGTGGAT
GCAAGTGTGTAAATTATTTTAGTCATCAGCTCTAATAAGATGGCCTTTGGGGAAAT 
GAGTATAAGGTCACGAAAATGAAATGGCAAGAAGGAGGTCTACTATTTCTTCTGTA 
ATACTGATTTTTACCCCATCAGGGTCAGTCCCCAAAGGTTGTAAATGTGAAGCTTG 
GTCTTTTTCTTTA 

Sequence ID 402 SEQ ID NO: 65

GACCCTATTCTCAGGATGAAAATAATACACTAGTAATAGTCTGCTCTGTTGGTTAA 
CTCCTCGTAAGGAGGTACAATTAAAATGCTGTAGTGTTGCAAGGGAAGGAGAGGAA
GAATCATATTCCTTCACTAGCAGGATCAAGAAAGCTTTTATAGAAATATACAAAAT
CTTCACTTCTTGAAGGATTGGTAAAATTTAATAGCCAACATTGGGCACTTATTCAT
TCTCTGAGTAAATATTTATTGCATGCTTATCTTGTATCAACATTGNGATGAAAGCN
CAAGAATGAAAGAGGAGGGAGAATGTTTANAGAATAAGGCTGAAACACAGATTTTG
TAGGGAGCGTAGGGGAGACTGANAAAACAG



Sequence ID 403 SEQ ID NO: 66 

AAGACACCTGATAGATTGTCTTGTATTATTTTTCCTTTGCCTTCTTACAATCTCAG
TGATTAGAATTGGGCTGAAAACAATACATCAAATTCTCAGCAAAATCCTTATGGGT 
TGCTGGATACCGAGGGTTTTTAAGATCTTTAGACTTCACTATATAGAACAAATGTT
GAATGGGAATTTTCTTTATTTCTATANCGTTTNG 

Sequence ID 405 SEQ ID NO: 67

CCCGGAATCGCGGCCGCGTCGACGATGAGCATTTTTTCATGTGTCTTTTGGCTGCA 
TAAATGTCTTCTTTTGAGAAGTGTCGGTTCATATCCTTTGCCCACTTTTTGATGGG 
GTTGTTTTTTTCTTGTAAATTTGTTTGAGTTCATTGTAGATTCTGGATATTAGCCC 
TTTGTCAGATGAGTAGGTTGCGAAAATTTTCTCCCATTTTGTAGGTTGCCTGTTCA 
CTCTGATGGTAGTTTCATTTGCTGTGCAGAAGCTCTTTAGTTTAATTAGATCCCAT 
TTGTCAATTTTGGCTTTTGTTGCCATTGCTTTTGGTGTTTTAGACTTGAAGTCCTT 
GCCCATGCCTATGTCCTGAATGGTAATGCCTAGGTTTTCTTCTAGGGTTTTGATGG 
TTTTAGGTCTAACGTTTCAGTCTTTAATCCATCTTTTAAAAGTCTCTTCACAGTAC 
ATGAGTAGTAGTGACACCAATAATGTCAGAGCAGGGAACTCCCAGGTTCTGCCCAT 
CCACAAAAACAACAAATAAGCTGGCAAAAACTTTAAGAATCAACTTTTGCAGATCT 
CTGAAATCTAGTCAAAACTTAAACAGAGGAAAGATTAATAAAGACNGGCTGCCTGA 
GATAACACTAACACACAC

Sequence ID 406 SEQ ID NO: 68 

CATCAAATAAATAAATAAATAAATTTTAAAAGTCACAGCATTGAATTTTTAAATGT
TTGGGATGATAAAGCACCTGCTTATCATGAAGCTANAGAAATTCAATGACACGTTT 
GCCAGGGTCTTTGCTAGTGATGTTGGAACAAGTCTGTAATGCTGATGAAACATCAC 
TGTTCGGGCATTATTGCCCCAGAAAGACACTGACTGCAGCTGATGAAACAGCCCTT 
CCAAGAATTAAGGATGCCAAAGACCAAATAACTGTGCTGAGATATACTTACGCAGC 
AGGCATGCATAAGTGTAAACTTGCTGTTATAAGCAAAAGCTTGCGTTCTCACTGTT 
TTCAAGGAGTGAATTTCATACCAATCCATTATTATGCTAATAAAAAGGCATGGATC
ACCAGGGACATCTTTTCAGATTGGTTTCACAAACATTTTGTACCAGCAGCTTGTGC
TTACTGCAGGGAAGCTGACTGGATGATGACTGCAAGATTTTGTTATATCTTAACAA
CTGTTGTGCTCATCCTCCAGCTGAAATTCTCATCAAAAATAATGTTTATGGCTCAC 
ACCTGTAATCTCAACACTTTGGGAGGATTGCCTGACCCAGGAGTTCAAGCCCACCC 
TGGGCAACACAGCAAGACCCAACCTNTC 

Sequence ID 407 SEQ ID NO: 69 

TTTTAAAAATCATAAAACGTTTCTTACAAAAGAGCATTACATTNTGCACACTGCTC
TGAACAGATGCCAGGGACATGTGGACTATTGTTACTTTTCCTCCCTGTCCCACCCC
CCAAATGTTACAGTGACCACAAAGCAAGGTGTTCACAATAATTACATGGGGGGAAT
TTTTTAAACCACCAACAATAACGAAAAATAAAATCCACTCACTCTGCTGCTGTTTC 
AAAATTTCAATGTTAGTTTTTGCACGCCCTTCCCCCCCCCAACCCTGTTTGTAAGG 
AACTAAAACATTACATCTGGTGAACAGCAAAGATTTCACTACACCTCAAATGCAGA
ACACCTATGAAGCAGAGGAATGTTGGCTTTTTAAACAGAAGCAGATAAAAAAAAAA
GATGCAGGACTCCTTCAGTTCTTCACTAGTCTTAGAAAAACTTTCCAGAATACTGC
TTCACACTATAAAAAAGAAAAAATATCTTGCATTAGAATCCTTCAACATCTGCATA 
CTGCTTCACACTGTTCGTTTCTAGGAGCACTTTGTCACAGGACACTTCTGCTTATA 
TTTCTTTAATCAGAACTTAGTTGGATGGGCCGGGCATGGTGGCTCACGCCTGTAAT 
CCCAGCACTTTGGGAGGCCGAGG-GGGTGGATCACC 

Sequence ID 408 SEQ ID NO: 70

CCATCTCCAAATTTAGTATTCATTCTGTTTAGCATATTATCAGTTGCCATCTATTT 
GTTTTAACTGATTACTTGAATCTGATTAAACATCACAGAAATGGGCTTTGATAAGA 
ACAATATTGAATAAGAAATTTTAAATAACAAAACAGCTTATAGAAAAATTCAGCAT
AACTTTTCCATCACCTTCACCACCCTTGCCTTTTATTATCCTGTCCTGTATCACTG 
CTTTCTGTTAGCAGTGTTGTGTGAGTTAGGATTTGGGCAGGAAAGCAAAAGCAACC 
ACCCGTCATTTTCCCAGAATGAAGGGTTTGACGTAGGATGTAGACTTTGTATAGTA 
GTTGGGAGAGCTGTGGGAGTGAAGGTCAGGGATGTCACCTACAGAAGTCAGGGAAT 
CTGCCACCAGAGATCCTGCATCAGAAACAGCCAACAGCGTGCTTCTGAAGAACTAG
TGGGGAAGTGGCTATAATTCTTAGGAATCCCAGCAAGTCCGCACCACTGTCTCAGT 
CTACAGCAGTGGAGAAAGGGGTTTCCAGGAGCTCTCTGGAAAGTTCCTGCCCACAC 
TTTGCAACAATCTTCAGAGGATAATGGGCTTCTCTTCCAGCTTCCACACCCAACAA 
GAGTGCCTTTCATCGGCCAACTCTAACCTGGAACCCTATGGCAGAGGGGATTTAGG 
AGACAGTTTGTNATGTCTGTGGAATGCAAATGAANANGTANCAATGCTTANTTGAC
AGCGGNCATACACAAATNTNGAAA 

Sequence ID 409 SEQ ID NO: 71 

GATCCGTNGACT

Sequence ID 410 SEQ ID NO: 72

CTCTTCCCAGCCCCTGAGCCCAGCCCCTTCCCAAGTGGTGCCAGACAAAAAACTAC 
ATGGCCCTTTCGTGTCTTGGGGGTGGAAAGGGAGGGATGAATTGGGGTGATAGAAC 
CCTGGTGAATTCAGAGTAATCTTTCTTTAGAAAACTGGTGTTTTCTAAAGAAACAG 
GATAGGAGTTTAGAGAAGGCACCAAAGCTTTCACTTTGGTTTGGCACCAGTTTCTA
ACCATCTGTTTTTTCTACCCTAGCTATCTTTTATTGGTAAAATATAAATGTATAAT
TATGTTTGTAGAGCTTTACCAAGGAGTTTCCCTCCTTTTTTGTTTGTTGATTAGCA
AATTTTTGATTCTCCATTTTCCAAAAGTAAGAGACTCCAGCATGGCCTTCTGTTTG 
CCCCGCAGTAAAGTAACTTCCATATAAAATGGTATTTGAAAGTGAGAGTTCATGAC 
AACAGACCGTTTTCCATTTCATCTGTATTTTATCTCCGTGACTCCACTTGTGGGTT 
T 

Sequence ID 411 SEQ ID NO: 73 nt : 505

TGGAGCTGAAAAATTCCTATTACCTAGGGGCATCACAACGCATTGCATTTCGCCCG 
TGTTTGGGATGATGCTGGTGTAAACCTACTATGCTGCCAGTCATGTAAAAGTATAG 
CACACACAATTAGTAGGTAATGCTTGCAAATAATAATGAAAGACTCTGCTACTGGT 
TTATGTATTTACTATGCTATACTTTTTGTCATTACTTTAGAGTGTACTCCTACTTT 
TTTTTTTTTTTTTTTTGAGATGGAGTTTCACTCTTGTCCTGTAGGCTGGAGCGAAN
TGGCGCGATCTCGGCTTACTGCAACCTCCACCTCCTGGGTTCAAGCGATTCTCCTG
CCTCANCTTCCCAGAGTAGCTGAGATTACAGGCATGCACCGCCACGCACGGGTAAT
TTTGTATTTTTGGTAGAGACAGGGTTTCACCATGTTGGCCAGGCTGGTCACCAACT 
CCTGACCTCAGGTGACCCGCCTCCTCACCTCCAGAGTGTTGGGATTACAGGNGTGA 
G 

Sequence ID 412 SEQ ID NO: 74 

ATAAAAATTAGCTGGGGGTGATGGGCCCTGTACCCCAGCTACTCGGGAGGTGAGGT 
AGGAGAATCACTTGAACCCGGGAGATGGAGGTTGCAGTGAGCCAAGATCGTGCCAC 
TGCACTCCAGCCTGTGTGACAGAACAAGACTCTGTCTCAAAAAAAAATAATAATAA 
TAATAATAATAAAAAGGAATAACATAGCTAGGAATAAATTTAATCAAAGAGGTGAA 
AGACTTATACACTTAAAACTACAAAAAAAAAATCACTGAAGGAATTATAGACCCAA 
ATAAAAATAAATAAAAAGACATTCTGTGTTTTAGGGAAAGAAGACTTAATATTGTT 
AAGATGTCAATACTACCCAAAGTGATCTACAGATTCAACATAATCCCTATCAAAAT 
TCCAACAGCCTACTTTGTAGAAATGGAAAAGCCAATTTTCAAATTCAGATGGAATT 
GCGAGGGGTTCTGAATAACAAAACACAATCTTGGGGAAAAAAAACAAAAAACAAAG
TCAAAGAACTCACACTTCTCTATTTATAATTTACTACAAAGTTATAGNATCAAAGT
CGACGCGCCGCGATCCGGGC 

Sequence ID 413 SEQ ID NO: 75 

CACAGTACTCCATTTTGGGGTCCAAACTGTAATGCTCAAAATAATAAATGCTTACA 
CGAAAATTATTTATTGAGAATATTCATATAAAAATTACCTAAAGCAAAGTAAAAAA
AGTAAAATCAAGGTGGTATATTTGAAGTGAATGGTGATTGGAAATTTTTAGCTGTA
ACAAAAAGAAAGAAAACAACTTTTTTTAAAGCCTCATTCTCTTTTCTTTCAAAATG
TACCTTATTCCCACACACTCTTGGGCTGACCTTTATTTTATCAATAAGCTCAATAT
TACTTTGTTTAAAATAAGATGCTTCAGCAAAAGTCATTCTCTCTTTAACCATATAA
TTTAAAAACTCCTCTTCACGATTGATAGCAAAATCAGAAACGTTAGGGCACCAGTG
AGTTGAAAAAACTGGTCTTAAGTTGGAAAAACTATTATTAATAATATTATCCTATC
CATCCATATCTATTGAAATTGTCAGGTCCATAATTTCATTTTAATTAATTATAGGA
AAGAAGAAAAGATAATACCCATTTGTTCTAT

Sequence ID 414 SEQ ID NO: 76

CTCAGACTCTTTCTGCCCTAATGGCCATTACTATCCAGTCTGTATTGCTACAAGGG 
ACCCACTGGTACCCCTTTTAGATTCTATCAAAAGGAACAGGGTTTTCCTAGAGGCA 
GGCAGCCTGGTGGTATGGCACAGCAGAAGCTTACTGCTAATGAAATGGGAACCTCC 
CCCTCCCTTGTGGTTTCAGCACAGAACCTGAATGCCAGGAAAAATTCCTGGGCCAA 
GAAGCTAAAGCTAAAGAAACCTTCCTTTTTTCAACGTTTTTTTTTCTTTCAAACTG
TAGGGTCACTTTTGATTGAGGCAAAGGGGTCCTACTGTAAGTGGAAAAGACTCACT 
CCCCTAACATAAGTTTTCACTGTGGTGGGATGGTGCCGCCCGATATGCTTGATATG 
CTTTTCCTTCCACATGTTAAGCTAGGAAACCTAACAGGATGTCAGCAGGGCAGTTA 
ACTCTGGACTCANAGCCCTCAAGGGCATGTGGCANAACCTCATGGCATNCAAGACC 
A 

Sequence ID 415 SEQ ID NO: 77 nt : 596

GTATAATTGATTCTTTTGAACCTAAAGTATAAGACTTCACGATTAGAAAAAAATTA
TCCAAAGACTAATGTAATTAAGTGAGGAAAAGGTGCTGGAGGAACTGGATAACCAC 
ATGGAAATGTATGAACCATGACCTCTATGTCACATACTATATATAAAACTTAATTT 
GAGGTGTATCACAGAGCTAACTGTGGGGGCTAAAACGTTGAAGCCTTTGGATGGCC 
GCACAAGAGATGTCTGCATTCATAACCTTGGGGAGGGTATGAACATTTCTTGGTAA 
CATGCAAAAAGCACTAACTGTAAAAGAGAACAGTTGGTCAGTTGAATTTCATGAAA 
CATTGTAAACTTCTGCTAAACAACTGACACCATTAAGAATGTGGAAAAAGGCTGGG
CACAGTGGCTCATGCCTATAATCCCAGCATTTTGGGAGGCCGGGGCGGGAGAATCA
CTTGAGGCCAGGAGTTTGAAACCAGCCTGGGCAACATGGCAAGACCCCGACTCTAC
AAAAATATTTTTAAAAATTAGTTGGGTGTGGTGATGCACTCCTGTAGTCCTAGCTG 
CCAGGANGCTAAGGNGGAAGGATCACTTAACCCTGG 

Sequence ID 416 SEQ ID NO: 78 

CTGGTGGCGGCGGTCGTGCGGACGCAAACATGCAGATCTTTGTGAAGACCCTCACT 
GGCAAAACCATCACCCTTGAGGTCGAGCCCAGTGACACCATTGAGAATGTCAAAGC
CAAAATTCAAGACAAGGAGGGTATCCCACCTGACCAGCAGCGTCTGATATTTGCCG
GCAAACAGCTGGAGGATGGCCGCACTCTCTCAGACTACAACATCCAGAAAGAGTCC
ACCCTGCACCTGGTGTTGCGCCTGCGAGGTGGCATTATTGAGCCTTCTCTCCGCCA 
GCTTGCCCAGAAATACAACTGCGACAAGATGATCTGCCGCAAGTGCTATGCTCGCC 
TTCACCCTCGTGCTGTCAACTGCCGCAAGAAGAAGTGTGGTCACACCAACAACCTG 
CGTCCCAAGAAGAAGGTCAAATAAGGTTGTTCTTTCCTTGAAGGGCAGCCTCCTGC
CCAGGCCCCGTGGCCCTGGAGCCTCAATAAAGTGTCCCTTTCATTGACTGGAGCAG 

Sequence ID 417 SEQ ID NO: 79

GCAGGGGCTTCTGCTGAGGGGGCAGGCGGAGCTTGAGGAAACCCGCAGATAAGTTT 
TTTTCTCTTTGAAAGATAGAGATTAATACAACTACTTAAAAAATATAGTCAATAGG 
TTACTAAGATATTGCTTAGCGTTAAGTTTTTAACGTAATTTTAATAGCTTAAGATT
TTAAGAGAAAATATGAAGACTTAGAAGAGTAGCATGAGGAAGGAAAAGATAAAAGG
TTTCTAAAACATGACGGAGGTTGAGATGAAGCTTCTTCATGGAGTAAAAAATGTAT
TTAAAAGAAAATTGAGAGAAAGGACTACAGAGCCCCGAATTAATACCAATAGAAGG 
GCAATGCTTTTAGATTAAAATGAAGGTGACTTAAACAGCTTAAAGTTTAGTTTAAA 
AGTTGTAGGTGATTAAAATAATTTGAAGGCGATCTTTTAAAAAGAGATTAAACCGA 
AGGTGATTAAAAGACCTTGAAATCCATGACGCAGGGAGAATTGCGTCATTTAAAGC 
CTAGTTAACGCATTTACTAAACGCAGACCAAAATGGAAAGATTAATTGGGAGTGGT 
AGGA 

Sequence ID 418 SEQ ID NO: 80 

CCCGGAATCGCGGCCGCGTCGACGGGAGGTGATAGCATTGCTTTCGTGTAAATTAT
GTAATGCAAAATTTTTTTAATCTTCGCCTTAATACTTTTTTATTTTGTTTTATTTT 
GAATGATGAGCCTTCGTGCCCCCCCTTCCCCCTTTTTTGTCCCCCAACTTGAGATG 
TATGAAGGCTTTTGGTCTCCCTGGGAGTGGGTGGAGGCAGCCAGGGCTTACCTGTA 
CACTGACTTGAGACCAGTTGAATAAAAGTGCACACCTTATAAAAAA

Sequence ID 419 SEQ ID NO: 81 

CCCGGAATCGCGGCCGCGTCGACGGGAGGTGATAGCATTGCTTTCGTGTAAATTAT
GTAATGCAAAATTTTTTTAATCTTCGCCTTAATACTTTTTTATTTTGTTTTATTTT 
GAATGATGAGCCTTCGTGCCCCCCCTTCCCCCTTTTTTGTCCCCCAACTTGAGATG 
TATGAAGGCTTTTGGTCTCCCTGGGAGTGGGTGGAGGCAGCCAGGGCTTACCTGTA 
CACTGACTTGAGACCAGTTGAATAAAAGTGCACACCTTATAAAA

Sequence ID 420 SEQ ID NO: 82 

CTTCATTTGAAATGGTTGAATCTGCTGTGTAATAAAGTGGTTCAACCATGATTAGG 
AACTGAAATTTAGTAGAAGAGGGAAAAGGAGTTAATGTAACAAATTATTTTAGCTA
CAAACCCCGGTAATAGAGCACTTGGGGGATGGGATGGGGTGGGTTGGTGAGACAAT 
CAGAATGGTAAATTGATTAAATGCTCCTAACCCTGTAATTTTGTGCATAGAGCACC 
CTATGCTGTGGAAATAACTGTTCTTAGATTTCATTGTAACTGGACTGTTCAGGTTG 
CCCAGAGGGAAAGAACATTCCTAATTCTAATAAAATAAACTTTTATTTTGTTTA

Sequence ID 421 SEQ ID NO: 83

TGTCATTGAATCTGCTTGTTACTTAAATGCTAAACTCAATTCTGTAATTCAATAGG 
TGCACCTCTCTGAGAAACATAAGAGACAATGAGGAAAAGGATTCAGCATTCCGTGG 
AATTTGTACCATGATCAGTGTGAATCCCAGTGGCGTAATCCAAGTAAGATGTTCAC 
AAAGATTTGTTTTTAATGTCTAATTAATAAAATTTTAAAGGAAGAAACATTCTAAT
ACTTTAATTATAAAAAGTTAACTATTTTCAAAGGTATCAAAATACAGTTAAACCTT
TAAAATGTATATTTCTTAATATCTTGAAATTGTAATGCCTTTTTTTTTTCCTAAAT 
TTTTTTTGTCATGAAATGAGATAGTAACAGCAGATTGGGACAACAAGGTTATATTC 
TTGTCTTGAATCAGGCCATGGCTTCTTTCATCCAAATTTCAGACCTCATTTATTTA 
CTTTGTCCCTGCCTCCCATCCCTGGATATCAGTTTGTGGATATCTACAGTTAATAG 
AGTGACCAAATAGTAGGAATACTGTCTCTCTATTCTGAATAAAATCTTTGAATCAG
ATTTAGAAATAATGAATAAAATACAAATCAGCCATTGAAATTGCTCTAATTTTGAG 
AGCTTATGATTTATTCATCTTTGGTTTCCAAGTTCAAGTTATATGTAGACATTTTA 
ATT 

Sequence ID 422 SEQ ID NO: 84 

GCTTCCTAGGTGAGGTCACGAGGAAACCTGCTGGCCAAGTGACCTGGCAGGGTGTG 
GCCAGTGTGGCCAGGGCCGCCGAGCCTGCTTTCCTTCCCTGCAGCAGGAACCCTTC 
TGGGGCTGTGATCCTGCGATGGTGCCTGGGTGGGAGTGGGGGTGGGGGGCGGGATG 
GTCTCCCTACCTGCCAGCTTCTTGGTTTGAGGTGAGGACAGCCCCGGAAGCTCANA 
CTTGGCTCCTGTCCATGTACTTGGGGCCATGAGCTCTGCAGGGACCTTGGAAAGAN 
AGAGACGGGTGGTGTANGGCANGGGAAGGCATTGTCTTCAAACAGGAAAAAGCTGA
NAATGGAAACAGGCGAAACTTACCAAGTGTAACATCACCTGGAACTGAAGGAGGGT
GGGAAGGTTTTAATTATTTTAAAAATAGAGATGGGGTCTCACTATGTTGCCCAGGC 
TGGTCTCAAACTACTGGGCTCAAGTGAACCTCCTTCT 

Sequence ID 423 SEQ ID NO: 85 nt : 387

TGTTTCTCNAGGGCGAGAGGCTGTCTTANAGCACCATTCTCTGGCCCTNGTCCCAT 
GAGAAGGAACCGCACTCAGGAGCCACACTCTCCCACTNCCCTTGCCCANAAGACTC 
ACAGAGGGCACGGAGCTGGCTGTGGTGAGAGGAGGTCCANCAAATTCCTGTCTGCA 
NAAGGGTTCTGAACACCACCGCCTGGCAGCGTGCTGGAGGAGGGATTCCTCTTTTC
CTCACAGCAATTCTGACCAGAAACCTGTCAAATCAGGAATGGCTAAAATAAGACCA 
GGGTATGAATGACCATCAGCCACAGTAAAACCAAGGCACAGCTCTCCTGAGCCCAC 
CCAAGCTGCTGTGGCCCAGACTGGTGACATCACCTCAGGGCAAAAAAAAAA

Sequence ID 424 SEQ ID NO: 86 nt : 420

CGCAGAATGGCTCCCGCAAAGAAGGGTGGCGAGAAGAAAAAGGGCCGTTCTGCCAT 
CAACGAAGTGGTAACCCGAGAATACACCATCAACATTCACAAGCGCATCCATGGAG 
TGGGCTTCAAGAAGCGTGCACCTCGGGCACTCAAAGAGATTCGGAAATTTGCCATG 
AAGGAGATGGGAACTCCAGATGTGCGCATTGACACCAGGCTCAACAAAGCTGTCTG 
GGCCAAAGGAATAAGGAATGTGCCATACCGAATCCGTGTGCGGCTGTCCAGAAAAC 
GTAATGAGGATGAAGATTCACCAAATAAGCTATATACTTTGGTTACCTATGTACCT
GTTACCACTTTCAAAAATCTACAGACAGTCAATGTGGATGAGAACTAATCGCTGAT
CGTCAGATCAAATAAAGTTATAAAATTG 

Sequence ID 425 SEQ ID NO: 87 

GGAAACTGATGCCAGTCAGAAACTCAGATCAAATGAAGGGGTGAAGAGAACCAGAA
TTGATCTCTCTGTAGGAGAATATAAATGACTTTTTTAAAGTACATATTTTCTGTGA
AAGACAGTTTTTTGTTTAATGCAAAAATGTTAACAATGTTTATATCATGTAGAAGT
AAAAGATCGTGAAACAGCACAGAGAACAGTAGTAAGACAGATTGAATTGCACTGTT 
GTAAGATGATGAACTTACAATATTAAGTGAAGGTAGACTGTGATAGATTAAGGATA 
TATATTGTAATCCCTAGAGCAATTGTCAAAGTGGTACAGGTAAAAAGCCAATAGAG 
GTGATAAAATGGAATACTAAAAAATATCAGATGAATAATAAAGAAGACAGGAAATG 
AGGAACAGTGGAACAGAATGAATAAAAAACAAGACCATTAACTTAATCATTAATAA 
TTACTTTAAATGGGTTAAACATTATGGTTATAAGGCAGAGATTTTCAGACTAGATA 
AAAGAGCAAGCTCCACTATATACTGTCTACAAGAGATATACTTTAAAGTGTATATT 
ATATTTAAATATAAAGATTTGGAATAAATAAACCTAAGAATAAGCTTACTAGGGAA
GTGAAAGATCTGTACAACAAGAATTACAAAACACTGCTGAACGAAATCATAGGTGA
CCA 

Sequence ID 426 SEQ ID NO: 88 

GTCCCGGAATCGCGGCCGCGTCGACGTTTCCTCAAAATTTATCTTCCTGTTAATGT 
CAGGCATGTATCTCCTTAGCTTGCCACAAATAACTATATATACCACAGACCTTCCT 
TTGTAGGGCTAACAGTGTTGCATTGTAAGTGGAGGCCTCATAGATACCTGGCCTTT 
TCCTACCTTATTCCAAAGATGGTTGCATCTTATAAATAATGTCATTCTTCAGCAAA 
TGGTATGGAAATGAGATTGTAATGTCATTATTTCCTCTTTAAATAATCAGGACAAC
TCATGATACAAAGAGCTCTTCTCTATAAAAGGTGGGACTTTTTTTTTTAGTAATAG
CAAAAATAAAATTGTACCTCCTTAATCTTCTACAGAAAGATGGATTTCATTTTCAA 
CATTAAGAGGTAGTTTTAAGAAGCAGTAGAAGTCAGCCTGGGCAGCATGGTGAAAC 
CCCGTCTCTACAAAAAAGTTAGCTGGGCTTAGTAGTTGCAATCCCAGCTACTCTGG 
AGGCTGAGGTTGGAGATCATCTGANCCTGGGGAGGTCNAGGCTGCAATGATACANT 
GAGCCCTGATTGTGCCACTCCACCTGGTTGCAGA 

Sequence ID 427 SEQ ID NO: 89 

TTCCAATCTTCGTGTTCACTTTAAGAACACTCGTGAAACTGCTCAGGCCATCAAGG 
GTATGCATATACGAAAAGCCACGAAGTATCTGAAAGATGTCACTTTACAGAAACAG 
TGTGTACCATTCCGACGTTACAATGGTGGAGTTGGCAGGTGTGCGCAGGCCAAGCA 
ATGGGGCTGGACACAAGGTCGGTGGCCCAAAAAGAGTGCTGAATTTTTGCTGCACA 
TGCTTAAAAACGCAGAGAGTAATGCTGAACTTAAGGGTTTAGATGTAGATTCTCTG
GTCATTGAGCATATCCAAGTGAACAAAGCACCTAAGATGCGCCGCCGGACCTACAG 
AGCTCATGGTCGGATTAACCCATACATGAGCTCTCCCTGCCACATTGAGATGATCC 
TTACGGAAAAGGAACAGATTGTTCCTAAACCAGAAGAGGAGGTTGCCCAGAAGAAA
AAGATATCCCAGAAGAAACTGAAGAAACAAAAACTTATGGCACGGGAGTAAATTCA
GCATTAAAATAAATGTAATTAAAAGG



Sequence ID 428 SEQ ID NO: 90 

TGCAGGATCCGTCGACTCTAGATAACATGGCTAGAAAAGAGAATGAAAAAGTTGGA 
ATTTTTAATTGCCATGGTATGGGGGGTAATCAGGTTTTCTCTTATACTGCCAACAA 
AGAAATTAGAACAGATGACCTTTGCTTGGATGTTTCCAAACTTAATGGCCCAGTTA 
CAATGCTCAAATGCCACCACCTAAAAGGCAACCAACTCTGGGAGTATGACCCAGTG 
AAATTAACCCTGCAGCATGTGAACAGTAATCAGTGCCTGGATAAAGCCACAGAAGA 
GGATAGCCAGGTGCCCAGCATTAGAGACTGCAATGGAAGTCGGTCCCAGCAGTGGC 
TTCTTCGAAACGTCACCCTGCCAGAAATATTCTGAGACCAAATTT

Sequence ID 429 SEQ ID NO: 91 nt : 535

CACAGTACTCCATTTTGGGGTCCAAACTGTAATGCTCAAAATAATAAATGCTTACA 
CGAAAATTATTTATTGAGAATATTCATATAAAAATTACCTAAAGCAAAGTAAAAAA
AGTAAAATCAAGGTGGTATATTTGAAGTGAATGGTGATTGGAAATTTTTAGCTGTA
ACAAAAAGAAAGAAAACAACTTTTTTTAAAGCCTCATTCTCTTTTCTTTCAAAATG
TACCTTATTCCCACACACTCTTGGGCTGACCTTTATTTTATCAATAAGCTCAATAT 
TACTTTGTTTAAAATAAGATGCTTCAGCAAAAGTCATTCTCTCTTTAACCATATAA 
TTTAAAAACTCCTCTTCACGATTGATAGCAAAATCAGAAACGTTAGGGCACCAGTG 
AGTTGAAAAAACTGGTCTTAAGTTGGAAAAACTATTATTAATAATATTATCCTATC
CATCCATATCTATTGAAATTGTCAGGTCCATAATTTCATTTTAATTAATTATAGGA
AAGAAGAAAAGATAATACCCATTTGTTCTAT

Sequence ID 430 SEQ ID NO: 92

CAGGGGCTTCTGCTGAGGGGGCAGGCGGAGCTTGAGGAAACCGCAGATAAGTTTTT 
TTCTCTTTGAAAGATAGAGATTAATACAACTACTTAAAAAATATAGTCAATAGGTT 
ACTAAGATATTGCTTAGCGTTAAGTTTTTAACGTAATTTTAATAGCTTAAGATTTT 
AAGAGAAAATATGAAGACTTAGAAGAGTAGCATGAGGAAGGAAAAGATAAAAGGTT 
TCTAAAACATGACGGAGGTTGAGATGAAGCTTCTTCATGGAGTAAAAAATGTATTT 
AAAAGAAAATTGAGAGAAAGGACTACAGAGCCCCGAATTAATACCAATAGAAGGGC 
AATGCTTTTAGATTAAAATGAAGGTGACTTAAACAGCTTAAAGTTTAGTTTAAAAG
TTGTAGGTGATTAAAATAATTTGAAGGCGATCTTTTAAAAAGAGATTAAACCGAAG
GTGATTAAAAGACCTTGAAATCCATGACGCAGGGAGAATTGCGTCATTTAAAGCCT 
AGTTAACGCATTTACTAAACGCAGACGAAAATGGAAAGATTAATTGGGAGTGGTAG 
GATGAAACAATTTGGAGAAGATAGAAGTTTGAAGTGGAAAACTGGAAGACAGAAGT
ACC 

Sequence ID 431 SEQ ID NO: 93 

CGCTGGGTGCCTGCAGCGCCTCCCTTGTCTCATATGGTGTGTCCAGCACTCTATTG 
TTGTAAACTGTTGNTTTGNCTGACCTAAATTNTCTTTACTAAACANATTTAATAGT
TNAAAAAAAAAAAANANCA

Sequence ID 432 SEQ ID NO: 94 

TTTTAAAGTCATCTCTATAGGAAGGTGCTGGGCAGGGATCCCAGAGAAAGAAAGGG 
TCCAAGACTCCATTAACTGCCCTGGATGAAGGGCACTGCTACAGCAGCTAGTACCA 
GAGACTCTCCTATCTCACGGTTGAGGCAGACCCAGGATAGAATAGAGAATAAAAGG
AATGCTTATAGGAAACAATTTTGTATGGAATGCTAGATGGCCAAGCCTCAGCCTTT
GGTCCAGTGCAACCCTTGCCTCGCTTGTCAACAGTGAAAAATTAGTTTGGTTAGAA
GAACCATCTGGAAACACACCAGCTTCTGCTACCTTCATGCTCATTGTTAAAAAAAG 
ATTAACCAGTGTGAACATTCTGATCTGTTAATTCCAGGGACTGTTTTCTTTCCAAT 
GGACTGTTTGTTGGTAGAATAACCCCCAAAAGCTCAAAGCTAAAATGCATCATCAG
TCCTAGTCGGCAGTTCCTTAAGAATGGACTGGCGGCGTGGTTGAGCTGATATGGAA 
AAGCTGCACCTTCCTGCAGAAGATCAACTGACCTGCTATCCCACCCCAAATTTCAA 
CCTGAGGTATATTTCAATGAAGGCAGGTAGCTGTGCTTCTCAGAGCA 

Sequence ID 433 SEQ ID NO: 95

TCCCGGAATCGCGGCCGCGTCGACCCGCCGCCGAGGATTCAGCAGCCTCCCCCTTG 
AGCCCCCTCGCTTCCCGACGTTCCGTTCCCCCCTGCCCGCCTTCTCCCGCCACCGC 
CGCCGCCGCCTTCCGCAGGCCGTTTCCACCGAGGAAAAGGAATCGTATCGTATGTC 
CGCTATCCAGAACCTCCACTCTTTCGACCCCTTTGCTGATGCAAGTAAGGGTGATG 
ACCTGCTTCCTGCTGGCACTGAGGATTATATCCATATAAGAATTCAACAGAGAAAC 
GGCAGGAAGACCCTTACTACTGTCCAAGGGATCGCTGATGATTACGATAAAAAGAA 
ACTAGTGAAGGCGTTTAAGAAAAAGTTTGCCTGCAATGGTACTGTAATTGAGCATC 
CGGAATATGGAGAAGTAATTCAGCTACAGGGTGACCAACGCAAGAACATATGCCAG 
TTCCTCGTAGAGATTGGACTGGCTAAGGACGATCAGCTGAAGGTTCATGGGTTTTA 
AGTGCTTGTGGCTCACTGAAGCTTAAGTGAGGATTTCCTTGCAATGAGTAGAATTT
CCCTTCCTCCCTTGTCACAGGTTTAAAAACCTCACAGCTTGTATAATGTAACCATT 
TGGGGTCCGCTTTTAACTTGGACTAGTGTAACTNCTTCATGCAATAAACTGAAAAG 
ACCATGCTGCTANTC 

Sequence ID 434 SEQ ID NO: 96 

TTCGGACGCAAGAAGACAGCGACAGCTGTGGCGCACTGCAAACGCGGCAATGGTCT 
CATCAAGGTGAACGGGCGGCCCCTGGAGATGATTGAGCCGCGCACGCTACAGTACA 
AGCTGCTGGAGCCAGTTCTGCTTCTCGGCAAGGAGCGATTTGCTGGTGTAGACATC 
CGTGTCCGTGTAAAGGGTGGTGGTCACGTGGCCCANATTTATGCTATCCGTCAGTC 
CATCTCCAAAGCCCTGGTGGCCTATTACCANAAATATGTGGATGAGGCTTCCAAGA 
AGGAGATCAAAGACATCCTCATCCAGTATGACCGGACCCTGCTGGTAGCTGACCCT 
CGTCGCTGCGAGTCCAAAAAGTTTGGAGGCCCTGGTGCCCGCGCTCGCTACCAGAA 
ATCCTACCGATAAGCCCATCGTGACTCAAAACTCACTTGTATAATAAACAGTTTTT 
GAGGGATTTTAAAA

Sequence ID 435 SEQ ID NO: 97 

CTGCAATGTGCAATAGTTGCACCACTGCACTCCAGCCTGGGTGACAGAGTGAGAAC 
CTATCTCTTAAAAAAAAAAAAAAAAAAAGGAAGAAGAGACATGAGAGGGCCCAAGT
CACTTGCTCACTCACTTTCCGTGTACATGTACCAAGAAAAGGCCATGTGGGAAAGA 
GCAAGAAGGCAGCCGCCTTCAAGACAGGAAGAGAGCCCTCACCAGAAACTGAGCCA 
GAACCTTGGAATTCCAGCCTCCANAACTGTGAGAAAAGAATTTTCTGTTGTTTCAG 
TCCCCCACACTATGGCATTTTGTTACGGCAGCCTGAGCTAATACTCCTACTTTGTC 
CTGCATTTACTTGGTCTTCCAGTTAGTTTTTTAGACTTTGGGAATCAGAGCAGTCA 
GTTGTCAGATTTTAGCTTACAGTTGTCCTACCTGTGCAACTGAAATTTCTTCCATT 
TTAAACCAGAGCAGAGTTTTAGAGTCAAAAGAAACCAGATCTTTTAGTGCAGAAGC 

Sequence ID 436 SEQ ID NO: 98

AAAAAAACTCCAGAGAAGTTTATAGAAAGAGATGACATGTAAACCCTGCTGAAAAA
TAGTTTCATTTGTTAGAATATAATTGTCTTCCACTAAAAAAAGAAAAAAAAAAGCA
TTTAAGGCTCTAAGATCTCTTGAAGTACCACTTTTCCTGAATCCCAGAGTTTTTAT 
GTGCATTATTTTTATGCGTTTGTAGTTTGATATGTTGTATTTATAAGTAGTTTTAG 
CTTTCCATTATGAATTCTTCTTTGACCCATGAGTTATTTAGGTAAGTGTTTAAAAA
TTTACAATAGTTTATATATGCAAATATTATGTTGTTAGAGTTGGTTTTCATGTCAT 
TTTTACATATACAGGGGCAGTTTCCCCAACTAAATTGTATATTCCTTAAAGCAGCA 
CTCTTAAATTTTATTTCTGTGTCAATTTCTTGNCTGTGTTTCCTGGCATGGAATAC 
ATGGCATAAAATTTGTTATGTAATTAAATGAAATATTATTATACTTTCTATTTTTT
AGAAAAAA 



Sequence ID 4^- SEQ ID NO: 99 nt : 577 

GTCGACAGGGATGACATAACTATTAGTGGCAGGTTAGTTGTTGGTCACTTTCAACT 
CTGGGTTCAAGCGATTCTCCTACCTCAGCCTCCCGAGTAGCTGGGATTACAGGCAT 
GCACCGCCACACCTAATTTTCTATTCTTAGTAGAGACGGGGTTTCTCCCTGTTGGT 
CAGGCTGGTCTCGAACTCCCGACCTCAGGTGATCTGCCTGCCTCAGTCTCCCAAAG
TCCTGGAACCACAGACATGAGCCACCACGCCTGGCCCCTTTTAAAATATTTCTGCT 
CATTGATGATGCACCCAGTCACCCAAGTGCTCTGATGGAGATGTATAAGGAGATGA 
ATGCTGTTTTCATGGCTGCTAATACAACATTCATTCTGCAACCCCCAAATCAAGAA 
GTAATTTTGACTTTCAAGTCTTATTATTTAAGAAATATATTTTGCAAGACTATAGC 
TGCCATAGACCGTGATTCCTCTGATGGATCAGACAAACTAAAATGAAAACCTCCTG 
CAACGTATTCATCATTCTAGATCCCTGAGGAATCGCCACACTGACTTNCACAATGG 
GTGAACTGGGTTACAGT 

Sequence ID 4-4jr SEQ ID NO: 100 nt : 552 

AAACAAAATTATTCTCTGAGAGGGAAAGGACATTTGAGGGAAACATCAAATTTCCC
CATAAATAAATGAATGGAGTTTGCAGGAAGGTGAGGGTGAGCAGAGATGTGTGTGG
ACATCTCTGACCATCCATCGCTGTATTCAAATGGATTGTTTTATTCCATTCTGGTC 
TCAGGCATGACCACGTCCAGTGAAGACATTTGAGGCAGCACATCTCAGGACCCAGG 
CAATAGACTGGCCCCAACTCAGGCTGGACTAAGGTGTGATTAATTCTTTGTTTTTT 
GTGTGGAACAGCTCACCTTGTCAGACAGCCTCAGGGCATCTCTGAGACACAGGGGC
AGAAAATGACATTCATCTTTTGAGTCCTCATCCATGGAGTGCTGTGTTTGGGGGGC
TGCATCTGCTGAAGCGAGAACCCCATTCTGCCACCCCACCAGGATGCCCATTCTCC
AGGACTTCTCCAACTTACTATTAGACTAAACCAGAACAAGCAACAAACTGTATTTA
TGCAAGCAAAATTGATGAGAAAATTATATTCAAATAAAGCAAAAATTA

Sequence ID 4-4^- SEQ ID NO: 101 nt : 606

TCGTGCCACTGCACTCCAGCCTGGACGACAGAGTGAGACTCCATCTCAAAATAAAT 
AAATAAATAAATAAATAAATAAATAAATAAAAAAATAAAAAATACTTCTGCTATGA 
AAAACCTAGTTGGTATTTTTGCTTATTTAATACTATAGAAATATGGTGATCTCATC 
TTTAATAGAGTGCTTTTAAGGTCCCCAGTGATAATCTCCTAAAATCATGAACTTTA 
AGAATTTATAATGTTAATATGAGGAAATGAAATCTGGATTATCTCACCACATATTA 
TATAATTCATTAGTGACAGAGCAAGAACTCCAGGTCACCTGTCTATTCCATGTTTT
TCCTATCTGCCTTTAAATGTTGAGATACTACCCTTATCTCATGTGAATGGAGAAAC
TGCCTAAAATGCTAAAACTGACTCAGAGGCACCCAGACATAAGTGAAGTGTGATTA
GAAAATCCTGGTCAGTTGAGTCTTAGCCAAATGTGTACCTACTGTGTCTGCCTCTA 
TCAAGTCAATGAAAACATGATCTGAGAACTGTAAGTCCATTTATGGAAAGGGTTGA 
TTTANAGATATTTTGAACTTNCAGTGATGAGCCCCTTCTCAAATAG

Sequence ID 446 SEQ ID NO: 102 

CGGACTCCTGTGCTAATTGTCAGCTTACATATCATTGTATAGAGACTGTTTATTCT 
GTACCAAACTGATTTCAAAAGTACTACATNGAAAATAAACCGGTGACTGTTTTTCT
TCATAAAGTTCTGCGTTTGGCATCTTCACTCTTTCCAAAATGTATCTGTACATCAN 
AAATGTCACTATTCCAAGTGTCTTTTTAGTGTGGCTTTAGTATGGCTTCCTTTTAA 
TATTGNACATACATTGNATCTTTGTTTTATGGNAATAAGTAATAAAAATGTAGACT 
TCATATTTTGTACAAAATGTCCTATGTACAGAATAAAAAAGTTCATAGAAACAGCC 
NANAA 

Sequence ID 447 SEQ ID NO: 103 

AGGCCGAGGCAGGCAGATCNCNTGAGGTCAAGAGTTTGAGACCAGCNTAGCTAACA 
TGGTGAAACCCCATCTCTACAAAAATATA-AAAATTAGCCTGG-GTGGTGATGGGC 
ACCTGTAACCCCAGCTACTCGGGAGGCTGAGGTAGGAGAATCACTTGAACCCGGGA 
GATGGAGGTTGCAGTGAGCCAAGATCGTGCCACTGCACTCCAGCCTGTGTGACAGA 
ACAAGACTCTGTCTCAAAAAAAAATAATAATAATAATAATAATAAAAAGGAATAAC 
ATAGCTAGGAATAAATTTAATCAAAGAGGTGAAAGACTTATACACTTAAAACTACA 
AAAAAAAAATCACTGAAGGAATTATAGACCCAAATAAAAATAAATAAAAAGACATT 
CTGTGTTTTAGGGAAAGAAGACTTAATATTGTTAAGATGTCAATACTACCCAAAGT
GATCTACAGATTCAACATAATCCCTATCAAAATTCCAACAGCCTACTTTGTAGAAA
TGGAAAAGCCAATTTTCAAATTCAGATGGAATTGCGAGGGGTTNTGAATAACAAAA
CACNATCTTGGGGAAAAAAAACAAAAAACAAAGTCAAAGAACTCACACTTCTNTAT
TTATAAATTTACTACAAAGTTATAGTAATCNAA

Sequence ID 448 SEQ ID NO: 104 nt: 329

TACGCACACGAGAACATGCCTCTCGCAAAGGATCTCCTTCATCCCTCTCCAGAAGA 
GGAGAAGAGGAAACACAAGAAGAAACGCCTGGTGCAGAGCCCCAATTCCTACTTCA 
TGGATGTGAAATGCCCAGGATGCTATAAAATCACCACGGTCTTTAGCCATGCACAA 
ACGGTAGTTTTGTGTGTTGGCTGCTCCACTGTCCTCTGCCAGCCTACAGGAGGAAA 
AGCAAGGCTTACAGAAGGATGTTCCTTCAGGAGGAAGCAGCACTAAAAGCACTCTG 
AGTCAAGATGAGTGGGAAACCATCTCAATAAACACATTTTGGGTTAAAA

Sequence ID 450 SEQ ID NO: 105

GAGCAGTGGCATGATCACACCTTACTGCGGCCTCCAACCCCTGAGCTTAAGTGATT 
CTCCCGCATTATCCTCCTGAGTAGCTGAGACTACAGGTGCATGCCACCATACACTA 
CTAAATTTGGGTCGGGTGGTGGTGGTGATTTTTTAATATTTTTGTAGAGACAGGGT 
CTCACTGTGATGCCCAGGCTGGTCTTGAACTCCTGGGCTCAAGCAGTCACCCACCT 
CAGCCTCCCAAAGCACTGGGATTACAGGTGTGAGCCACCACACTGGCCAGCTTTGT 
TTTGTTTTGATGACTAAGCTGCTCTTGCTAAAAGGGCTTCTCTCTGAACTTCCCTA
CCTTTCTTCTGTTTCCCTGGGCTAGGGCTCCATGTTGGCAGTCCTACTCCCAATTA 
ACCTGGGGCTGTCTGGTTAACCTTTATAAGATCTGCAGTCATTGGGAGACCCGGGG 
ACCAGGAATATTGTTGTTGAGGGAGCTACCCTGGAAAGTGGATGGGTGGCCAAAGG 

Sequence ID 152 SEQ ID NO: 106 

TTTGGCTTTGCCTCTAGGCATTAGATGTTATCTTTGGAGGCATCCTTCTATGAGCA 
TTCATTTTTGGACCAAGCCTGGATTTACAATTCTATTACTGGCCCAGACTTCATTT 
CTATCCAATTTCATTCCACTGTGCTATAGTTTACAACATATAATTTGACTTATAAA 
TAATTCCTGACTATGGGTTTAAAGACTGAAAATGGATCAATAGAAACTTTGAAAAT
GTTAACATCTTGATTGCTTTTCTCAGTGTAGAAATGGACAATGTTTAGCTTAAAAA 
CTGCATGTTTTTAATGAGATACGGGGTTGAAAGACTTATTCCTGGAATTTATTGTT 
CTGGAGAAAGCCTGTTGCTATCTGCCATACCTTGGTTTACTTTGTGCAAAATGAGC
TTCTTTTTAAGTAATGAGCTCTTTCCATGTTCAGCTTAAATTGCTGTCTTAGACAC 
TTCATCAGGGTTCCCTGCTCTGCCTCATTCCCCCTTTTGCTCACTTGCAGCCTTTG 
ACATAATCCTGGGAGGCAATTGGCATCATACATATTTTGCTTTGTAATCTCCTGCT 
TTGATTCTGACTGGGACCCAGC

Sequence ID 4^ SEQ ID NO: 107 nt: 747

GGATCTAAGACCAGCCTGGCAGCCACCAGATGGTGATTCTAGTCCTGGCTCAGTCA 
GTAATAGGTCACTGACCCCAGAGAAATCAATTCAGCCTCCCCAGGTCCTTGGATTT 
C T T T C T GT GAAAAT GAAAGC AT AGG T AGGAAT T T C CC AT GGAAC AGC T AGC AGAGG 
AGAAAT AT T AAAAGT C AGGAGAC T C AT GC T AT AGT T T T CAT AC T T CAT T AC AAC AA 
TGTTGTTTAGGACAAGTGAGTTAACCTGTTAGCTTCCTCTATATAAAATGGAAAGT 
CATTAAAAACCTACATAGCAGGGTTCTTGTGAAGATCAAGTGATAATGTAGGAAGC 
ATGTACAAATGTCACATTCTGCCGTCACGTAATGGTCCTCACAGCTTGAGGTAGCA 
TTTAGCATGTGTCATGATTTAGTACAAGGGTTGGCAAACTGTTGCTCTTGGATTAA 
GTCTGGCTCATTGCCTGTTTTTCAAAGAAAAAAATTGTATATGTGTGTATATATGT 
T AT AT AT AGGT AC AC AC AC AT AT GT GC TAT AT AT AGC AT AT AT AC AC AC AT AAT AT 
ATAAACATGTACATATATAGCATTATATATATACCGTGTATAATATCTCCAGTCCT 
CATGACCAGCCATGCTTGTTCATTTACATTTGCATACTCTATGATTGCTTTCATGC 
AACAATGGCAGAGTTGAGTGATTGTTTTGCACAGANACTGTATGGCCCACTAAACC 
TAAAATATTAATCTCTGCC 

Sequence ID 454 SEQ ID NO: 108 

CTCCTGCCGGGCTCGTGGCGGCTTCTGTCCGCTCCGCGGAGGGAAGCGCCTTCCCC 
AC AGG AC AT C AAT GC AAGC T T G AAT AAG AAAAAC AAAT TCTTCCTC C T AAGC C AT G 
GCATATCAGTTATACAGAAATACTACTTTGGGAAACAGTCTTCAGGAGAGCCTAGA 
TGAGCTCATACAGTCTCAACAGATCACCCCCCAACTTGCCCTTCAAGTTCTACTTC 
AGTTTGATAAGGCTATAAATGCAGCACTGGCTCAGAGGGTCAGGAACAGAGTCAAT 
TTCAGGGGCTCTCTAAATACGTACAGATTCTGCGATAATGTGTGGACTTTTGTACT 
GAATGATGTTGAATTCAGAGAGGTGACAGAACTTATTAAAGTGGATAAAGTGAAAA 
TTGTAGCCTGTGATGGTAAAAATACTGGCTCCAATACTACAGAATGAATAGAAAAA 
ATATGACTTTTTTACACCATCTTCTGTTATTCATTGCTTTTGAAGAGAAGCATAGA 
AGAG AC TTTTTATTTATT 

Sequence ID 4^- SEQ ID NO: 109 nt: 682

TGCCACTGAAGATCCTGGTGTCGCCATGGGCCGCCGCCCCGCCCGTTGTTACCGGT 
ATTGTAAGAACAAGCCGTACCCAAAGTCTCGCTTCTGCCGAGGTGTCCCTGATGCC 
AAGATTCGCATTTTTGACCTGGGGCGGAAAAAGGCAAAAGTGGATGAGTTTCCGCT 
TTGTGGCCACATGGTGTCAGATGAATATGAGCAGCTGTCCTCTGAAGCCCTGGAGG 
CTGCCCGAATTTGTGCCAATAAGTACATGGTAAAAAGTTGTGGCAAAGATGGCTTC 
CATATCCGGGTGCGGCTCCACCCCTTCCACGTCATCCGCATCAACAAGATGTTGTC 
CTGTGCTGGGGCTGACAGGCTCCAAACAGGCATGCGAGGTGCCTTTGGAAAGCCCC
AGGGCACTGTGGCCAGGGTTCACATTGGCCAAGTTATCATGTCCATCCGCACCAAG
CTGCAGAACAAGGAGCATGTGATTGAGGCCCTGCGCAGGGCCAAGTTCAAGTTTCT 
GGCC GC AGAAG AT C C AC AT C T C AAAGAAGT GGGGC T T C ACC AAGT T C AAT GC T GAT 
G AAT T T GAAGAC AT GGT GGC T GAAAAGC GGC T C AT CC C ANAT GGC T GT GGGG T C AA 
GTACATCCCCAATCGTGGCCCTCTGGACAAGTGGCGGCCCTGCACTCATGAAGGCT 
TTCAATGTGC 

Sequence ID 159 SEQ ID NO: 110 

TCCCGGAATCGCGGCCGCGTCGACCTTGTCCTTGAGCGTCAACCTTCTTTCCCTGA 
AGTGGCTGGGGTTCCTGTTTCCTTCTTTGATTGACAACTTGTGTTAACCCTCGCAC 
ATCTCTGGG C C AAT TTTTGCTTGT AAG TCTTTCCG GAG AC C C C T G G AAT T T AAAT C 
ATTAGCACCGCGCCCTTCCCCGAAGAGTCTTCGAAGGGTTGCCGCTTTTCGGTGGC 
GCAGTTCTCGCGAGAAGGTGACTTTCTTTCTCGGTATTTCCTGGTTTCCAGAATCC 
TTAGCGCGAGGCGGAAAAAATATTTCTCCCAGCTTGTGTTGATGCCGCGATTTTGA 
CTGAGACTTCTTCCCACGATTTCTGTTTTTGCTTCTCCAAGGAAAATGGCAGCTCC 
CGAGCAGCCGCTTGCGATATCAAGGGGATGCACGAGCTCCTCCTCGCTTTCCCCGC 
CTCGGGGCGACCGAACCCTTCTGGTCAGGCACCTGCCGGCTGAGCTTACTGCTGAG 
GAGAAAGAGGACTTGCTGAAGTACTTCGGGGCTCAGTCTGTGCGGGTCCTGTCAGA 
T AAG GGGCGACT G A A AC A TACAGCTTTTGCCACATTCCCT A A T G A A A A A G C A G C T N 
T A A AG G C A T T G AC A A AC T N C A T C A AC T G A A AC TTTTAGTCATACTTT A A T C G 

Sequence ID ^&# SEQ ID NO: 111 nt: 536

CAGAGATCAAAATAGGCCTTACACAGTGCGACGCGAATTTAAAAGATTACCCCATT 
CAGGTGTATGGATTTTGCAGTATTAAAGATGCTGCCTGGAATAGGTCATTATCTTC 
TCCAAGTACTCTGTTAAGTCAATGAGTCACATAGAGTATAAGGTTTATTATCTGCT 
TTTCTTTCATTAAATAAATCTTTATTGAATTTCTACTACATTAAAAAACCAAACCA 
A AAC AAAAC AA AC AAAAA AAAC AC T T C C C T GAG C C A T AAAG GAG AAG G T AG T T T T G 
AC T GGAAC C T T GAAGGAT GGG T AAAC T T T C AGC AG AT AAAGAT T G AGAGAAG AC C T 
TCCAGGTAGAGAAAGCAGTGTGGGCACAGGCAAAGATGGAAGAACACACGTGGCTG 
TGGGAAACACAGCTAGAAGCCAGTGCGGATAGAGAGTAGGCTATGATGTGCAAAGG 
T TAN AC AC T GGGAGAGAC AGGT CC AT GAGAGT AGC T T GGAC T AAC AC AGGGAGGGT 
T T G G AA T C C C A AC T G G G G AAC C T AN AAA T C AA 

Sequence ID 461 SEQ ID NO: 112 

TAGGAGGCTTATTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCT 
ACGCCAAAATCCATTTCACTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCA
CAACACTTTCTCGGCCTATCCGGAATGCCCCGACGTTACTCGGACTACCCCGATGC
ATACACCACATGAAACATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAG 
T AAT A T T AAT AAT T T T C A T GAT T T G AG AAG CCTTCGCTTC G AAG C G AAAAG T C C T A 
ATAGTAGAAGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCCCACCCTA 
C C AC AC AT T C G AAG AAC C C G T AT AC AT AAAAT 

Sequence ID 162 SEQ ID NO: 113 

TCTTTATCAAGTTGAGAAAGTTCCTCCCCTCTATTCCTAGTTTGCTAAGAGTCCTT 
CTATCCTATTTCTTAATGGTTTAGTAGATGACTCTGTGGTACTTTGAAGGTTGTTT 
GCAGAATTTCCATGCCATAGGCAATTTACCTTTCCTTGACATTTGAAGGATTGATG 
TTGGTGCCAAGTATAGAATCTTCACAGAGTCCTCCTGTAGCTTCTAAAGGTTTAGC 
T T G AAAAT G T T AAT T G C T T AAC G C T AG T AAG T GAG T G AAAAAG C T G G G GAT A AAT T 
TTGTATCTTGCTTATATTTCAGTTCCCACCTCTGTCCNGACNAAACCCCCATATAT 
AA 

Sequence ID 163 SEQ ID NO: 114 

TAGTTTACATATCCCAACCTTTAAAAATATTCCTCTTATTAGCTTTATATTCACTT 
TATAGAAGTTGAGTTTTAATTAAAATTCTTGGCATCCTGAAGTATGTCACATAGCA 
TGTGCTCCT T AT AAAT AT G T T GAT AT C T C AG AAG AC AGC AT CCCGGTTTTCATTTT 
ATAAAGTACCATACTTAAGAATGCTGTAATACTTATCTTTTATAACATGTTTCCTT 
CGCTTTGCTTGNCTTTTATGNCATCAGTTTTAACTGTTTACTTCATTTAACAGNTT 
ACATCATNCAACAGTTTACTTCATTAAACAGTAGGTGGAAAAATAGATGCCAGTCT 
ATGAAAATCTTCCCATCTATATCAAAATACTTTCAAGGATATACTTT 

Sequence ID 464 SEQ ID NO: 115 nt: 615

CGACTTTCAACCATCAAGTGAGGAATACCTTCACATAACTGAGCCTCCCTCTTTAT 
C T C C T G AC AC AAAAT TAG AAC C T T C AG AAG AT G AT GG T AAAC C T GAG T TAT TAG AA 
GAAAT GGAAGC T T C T CC C AC AGAAC T TAT T GC T GT GGAAGGAAC T GAG AT T C T CCA 
AG AT T T C C AAAAC AAAAC C T AT GG T C AAG T T T C T G G AG AAGC AAT C AAG AT G T T T C 
CCACCATTAAAACACCTGAGGCTGGAACTGTTATTACAACTGCCGATGAAATTGAA 
TTAGAAGGTGCTACACAGTGGCCACACTCTACTTCTGCTTCTGCCACCTATGGGGT 
CGAGGCAGGTGTGGTGCCTTGGCTAAGTCCACAGACTTCTGAGAGGCCCACGCTTT 
CTTCTTCTCCAGAAATAAACCCTGAAACTCAAGCAGCTTTAATCAGAGGGCAGGAT 
TCCACGATAGCAGCATCAGAACAGCAAGTGGCAGCGAGAATTCTTGATTCCAATGA 
TCAGGCAACAGTAAACCCTGTGGAATTTAATACTGAGGGTGCAACACCCCATTTTC
CCTTCTGGAGACTTCTAATGAAACANATTTCCTGATTGGCATTAATGAANAGTCA

Sequence ID 469 SEQ ID NO: 116

GAT T T T T AAAAAT AC AT AT AGC AAAAAT AT T AC AGGGT C AGGGGAGAC AAT T AGAA 
T G A T A T A A T T C A A AG T G G A T T A A A A A A A A A AC T G T C AC C C AG A A T AC A A T AC C C AG 
CAAAG TTGTCCTT C AT AAAT G AAAG AAAAATN AAAT C T T TNCCNAACNA 

Sequence ID 471 SEQ ID NO: 117

TCCCGGGAATCTGCAGGATCCGTCGACT

Sequence ID 472 SEQ ID NO: 118 

GACAGTGCCCAGGGCTCTGATATGTCTNTCACANCTTGNAAAGTGTGAGACAGCTG 
CCTTGTGTGGGACTGAAAGGCAAGATTTGTTCCTGCCCTTCCCTTTGTGACTTGAA 
GAACCCTGACTTTGTTTCTGCAAAGGCACCTGCATGTGTCTGTGTTCTTGTAGGCA 
TAATGTGAGGAGGTGGGGANACCACCCCACCCCCATGTCCACCATGACCCTCTTNC 
CACNCTNACCTGTGCTCCCTCCCCAATCATNTTT 

Sequence ID 4^ SEQ ID NO: 119 nt: 694

TGGGCTTTGGGCTGGCTGCAGTCTGTCTGAGGGCGGCCGAAGTGGCTGGCTCATTT 
AAGATGAGGCTTCTGCTGCTTCTCCTAGNGGCGGCGTCTGCGATGGTCCGGAGCGA 
GGCCTCGGCCAATCTGGGCGGCGTGCCCAGCAAGAGATTAAAGATGCAGTACGCCA 
CGGGGCCGCTGCTCAAGTTCCAGATTTGTGTTTCCTGAGGTTATAGGCGGGTGTTT 
GAGGAGTACATGCGGGTTATTAGCCAGCGGTACCCAGACATCCGCATTGAAGGAGA 
GAATTACCTCCCTCAACCAATATATAGACACATAGCATCTTTCCTGTCAGTCTTCA 
AACTAGTATTAATAGGCTTAATAATTGTTGGCAAGGATCCTTTTGCTTTCTTTGGC 
ATGCAAGCTCCTAGCATCTGGCAGTGGGGCCAAGAAAATAAGGTTTATGCATGTAT 
G AT GG T TTTCTTCTT G AGC AAC AT GAT T G AG AAC C AG T G T AT G T C AAC AGG T GC AT 
T T GAGAT AAC T T T AAAT GAT G T AC CTGTGTGGT C T AAGC T GGAAT C T GG T C ACC T T 
C CAT C CAT GC AAC AAC T T G T T C AAAT T C T T G AC AAT G AAAT G AAAC T C AAT G T GC A 
TATGGGATTCAATCCCCACCATCGATCATAGCACCCCCTATCAGCACTGNAAACTC 
TTTTGCATTAAGGGATCATTGC 

Sequence ID 474 SEQ ID NO: 120 

GGCAGCGCGGGGAGCCCGTCGGCGCCGGCGGGCGGGCCGGTTTCGAAGTTGATGCA 
ATCGGTTTAAACATGGCTGAACGCGTGTGTACACGGGACTGACGCAACCCACGTGT 
AACTGTCAGCCGGGCCCTGAGTAATCGCTTAAAGATGTTCCTACGGGCTTGTTGCT
GTTGATGTTTTGTTTTGTTTTGTTTTTTGGTCTTTTTTTGTATTATAAAAAATAAT
CTATTTCTATGAGAAAAGAGGCGTCTGTATATTTTGGGAATCTTTTCCGTTTCAAG 
C AT T AAGAAC AC TTT TAATAAAC TTTTTTTT GAT A A T G G T T A A A A A A A A A A A A A A A 
A 



Sequence ID 475 SEQ ID NO: 121 

CATAATAAAAAACAATCAACAAACAGGGAATGGAAAGAAACTTCCTCAGCATGGTG 
AAGGCCACATATGAAAATCCCACAGCTAACATCATACTCAATGATGAAAGACTGAA 
AGCTTTTCTCCT GAG AT C AGG AAC AAG AC AAAG AT GTCACCTTTTGTCACTTCTAT 
TCAACTCATTATTGGAAGTTTTTGCCAGAGCAATTAGGTAAG 

Sequence ID 4^£ SEQ ID NO: 122 nt: 476

CAGAATCTTTTCATAGGCTGAATGTTGCTCCACAATGTGTCCTTTGACTATCTCTG 
GCTAATTATTATTTTAATCTCTTCTCAGCTTTTCCAAGAACATAACGTTAACCAAA 
GATCTTAGGCCATTCACAACTCTTTTGTAAAAATTAATGTGGATGTGAAACGAGGC 
AACAAATCCTGAAGTAGAAAGTTATTCCTGGCCAGGCACGGTGGCTCACGCCTGTA 
ATCCTGGCACTTTGGGAGGCCGAGGTGGGTGGATCATGAGGACAGGAGATCGAGAC 
C AT C C T GGC C AAC AT GAT G AAAC C C C AT C T C T AC T AAAAT AC AAAAAAT T AGC T GG 
GCATGGTGACGCGTGCCTGTAGTCCCAGTTACTCGGGAGGCTGAGGCAGGGGAATT 
GCTTGAACCTCGGAGGTGGGAGGTTGCAGTGTGCCGAGATCACGCTACTGCACTCC 
AGC C T GGC AAC AG AGC AAG AC T C C AT C T 

Sequence ID 477 SEQ ID NO: 123 

AAAC AGAAAGTTTCTTCTAAAGGCATGATTCAGTTAAGTCATTCTTAAGTGTT AAA 
AAATTGTGAAAAATGTGCCTGTAATCCCAACACTTTGGGAGGCCGAGGCAGGCAGA 
TCACGAGGTCAGGAGATCAAGACCATCCTGGCTAACAAGGTGAAACCCCGTCTCTA 
CGAAAAATACCAAAAACATTAGCCGGGCGTGGTTGTGGGCGCCTGTAGTCCCAGCT 
AC T T GAGAGGC T GAGGC AGGAGAAT G 

Sequence ID 178 SEQ ID NO: 124 

TTCTTGGGATATTGATGACTACTGTCTGAGAGGTGCTGTGGGGAGATTTTCAGGAT 
TGTGTGGTCTTTGAGGGGGGTGTTTTTTTAAGACAACATTGACCACTGTCCACTGT 
CCACATGATCATTGTAAAATTGCAATGCCGCATGCTAGTTGGTTACATAAGACATA 
ATTCCAGTGATTGAAGGTGGTTACACTGTATGGTGGTGTGTTCAAGATGGCACTGG
CATCTTTGAGCAGAGCCTGGCTATGCAGCATCATTTGAGTTTTTTAAACACCCTAN
AGGTCTGGTTGTTGTTGCTGTTGTCCTTTCCTGTGAAAGTCACAANANAAGTTACA 
GTCCAGGTGAACCTGGAGTTTATAGGTTGGTTTTGTTTCTGNTATATATATATATA 
TATATATTTTTTTTTTTTTTTAACATTTACCTGTAGTGCTGTAGCTGTTGATACTA 
TCACCTGCATGCTATTTCTAGTGAGTGCTAAATACAGTATGGTCCAATGACAATAA 
CAGCCCATGGTACTGCCAG 

Sequence ID 179 SEQ ID NO: 125 

CATCAGTCTGTTATCCATGCTGACTTTCCGAAGACTTGCAGCTACTGCATTGATAT 
CTTTCCTGCCAATAAGCAAAGTGTTGAACACTTCACAAAATATTTTACTGAGGCAG 
GCTTGAAAGAGCTTTCAGAATATGTTCGGAATCAGCAAACCATCGGAGCTCGTAAG 
G AGC T C C AG AAAG AAC T T C AAG AAC AG AT GTCCCGTGGT GAT C C AT T T AAGG AT AT 
AAT T T TAT AT G T C AAGGAGGAGAT GAAAAAAAAC AAC AT CC C AGAGC C AGT T GT C A 
TCGGAATAGTCTGGTCAAGTGTAATGAGCACTGTGGAATGGAACAAAAAAGAGGAG 
CTTGTAGCAGAGCAAGCCATCAAGCACTTGAAGCAATACAGCCCTCTACTTGCTGC 
CTTTACTACTCAAGGTCAGTCTGAGCTGACTCTGTTACTGAAGATTAGGGAGTATT 
G C T AT G AC AAC AT T CAT T T CAT GAAAGCC T T C C AN AAAA 

Sequence ID 481 SEQ ID NO: 126 

C AC AC T T T CAT GAT AAAAAC AGAAC CTAGGAAT G A A A AG A A A T TAT AGC AAC AT AA 
T AAAGAC CAT AT AT GAGAAGC CC AC AGC T AAC AT AC T GT AT GGT G AAAAAC T GAAA 
GCTCTTCCTCTAAGATCAGGAACAAGGCAAGGATGCCCATTCTTGCCACTTCTATC 
GAACGT AGT AC TGGAAGCCCTAGCC AGAAC AAC TAGGC AAT AGAAAGAAATT AAAG 
GCATCCATNTCAGAAAGGAAGAANCAAAATGCTGTCTGTTTAANATGACA 

Sequence ID 482 SEQ ID NO: 127 

T T T C T AT AN AAAAAAAT T T T T T AAAAT AAT T G T AAAG T TAG AT TTAAAAT T G T AAA 
A T A T AAAA T C A C AAA GGAATGTACCC A A T AAAA T G T AAA TGCNCCAT A A A A A A A A A 
AAAAAAAAAAAAAAAAAA 

Sequence ID 183 SEQ ID NO: 128 

CGNTAACGTGCAATCCGCCGCACGCCAGCAAACTGGACAAACTCCGGGATCTCATC 
GAAGCGATTGAGCACCAGTACCAGAGTAATACCGGACTGATGTAACGAGGCGAGTC 
GCTCATCCAGCTTGCTGACGTGAGGCAACATCCAGGCCATCGAACGGNTCATCAAG 
AATCAACAAGTCAGGCTCCGACATCAGCGCCTGACACAGCAGGGTTTTTCGCGTCT 
CGCCAGTGGAAAGGTATTTAAAGCGTCNGTCGAGGAGGGCGGTAATACCGAACTGC
TGCGCCAGTTGCATGCAACGCGGTGCATCCTTTACTTCATCCTGAATGATCTCAGC
CGTAGTGCGTCCGGTGCCATCTTCGCCAGGGCCGAGCATATCGGTGTTATTCCGCT 
GCCATTCGTCGCTGACGAGTTTTTGCAATTGCTCGAAGGAGAGACGAGTGATGTGG 
GAAAACTGGCTTTGCCGTTCACCTTTCAAAAGCGGGAAGTTCCCCCGCCAGCGCGC 
GGGCCAGGGCCCGAT 

Sequence ID 484 SEQ ID NO: 129 

TTTTTTTTTTTTATTCTAT T AAAAAAT GTTNNT G AAAAAAG AT AC T TAAAT T T T AA 
AGATAACTNAATTCCTAANGATTTAAAATAATCCAAGCAGAGATGAAAGANCAAAT 
GCAAATGCNTAAAAAGACCCCANAGCATTGTTAGCAAAAAGCAAATATAGTTAGCC 
AAGCATATATATNTCATAAAAGCAATAANAAGGCNTAAAGCAAGTTTGGGGAGAGC 
TTATT TAAAAC T T GTAAAAAT CAT T T GAAT T T TTAAAAG T T T T CAAAC 

Sequence ID 485 SEQ ID NO: 130 nt: 551

TTTGGAACACAAAGTTCCCTTTTTAGAAGAATAGGTATTGAGCCCTTGAGCGTGGG 
T AGAAAGAT AGAGAC AGAGT GAT T T GC AAAAT AAT GGAGGAT CAT AT T TAT AT AT G 
AATTTTCACTTATTTGAACTTTCAGATATCANCTTNAAAANCTTTGGTTTAAGTAA 
AGTNTNTT AAT GAG AC T C C T T G GAT G AAAG T AAC C AAAAC C AG T A AAAA T AAG G T A 
ATAAGGATGTAATAGTTTCTTATGGACACTCAACAGCTAGAATGCAGTTAGTCTCA 
G AAAAGAAT T AG AAC AAA T AAC T GG AAGGC C AT C AGGAGT C C AAAAC CAT C AC T C T 
TTTATATTTTATATTTTATTTTTCTCTCTTCANATGAGCATTCTCTTTCTATGTCC 
ATATGGTANAAGGCGGCAGCTCCATAGATTATGGCTTCAGATGTTACAGTTCCGCT 
NAATGCAGGGACAGACTTGCTATCTTTCAGTCCCCTTACATATCCTGGGGAGAGAG 
CAAATGATTGACTGGCTTGAGTCAGGTGCCCGTTCCCTTTCCAATCT 

Sequence ID 4-&^ SEQ ID NO: 131 nt:224 

GTTTGNTTGTGACCATCTGTACTTGTAATTTCTTTACNTTCATTGGTATGAAAAAT 
ATGTTCTT AG A AG C AN G A A A A AG A A TTCAGNTTTGCTTTGTATACT AAA T T AAA T G 
CTGT AAT TTTGAT AAAAT GAAAAATCTGCTTT AT TTGC AAC AAT TGGTTTCTTCCT 
TGACGTCAGCCTCACTCTTGGACTTTGGTATTCAGCCNGNCACCCCTGGGAATTCC 

Sequence ID 488 SEQ ID NO: 132 nt: 349

GTGCCTCCCTGTGTGAGTAGCCTAAGGTGCATTGAAAAAGACTGGGATGTGTTTTA 
TTTTTTTGTATTAGATAGCATTAACCTTACTGTTGAAGTATTTTTGGTGGAGTATT
AGTGACAAGCCATTGAGTCTTAAGCCTTACGGCTTCCTATAAAATCACTAATTTCG
TGTGTGTTTGTGTGTAGGTTACGTTATATATAGGATTCGTGTTCGCCGTGGTGGCC 
GAAAACGCCCAGTTCCTAAGGGTGCAACTTACGGCAAGCCTGTCCATCATGGTGTT 
AACCAGCTAAAGTTTGCTCGAAGCCTTCAGTCCGTTGCAGAGGANCGAGCTGGACN 
CCCTGGGGGGCTC 

Sequence ID 189 SEQ ID NO: 133 

T T AAC AGC T GC AT AGAGT T T T AAAAGT AC AT TAT AT T T T GT C AGAC AAGT AAAAT A 
TCTGTTTTT C A C G C A A A A A A A G C C A T GAAAT ACGT AAT T T T T T AAAGAC AAAAAAT 
CATCTTTTGAGTTTGCTCTTTGGTTTTTCTTCATTCCTTTTGAGGATTGGGAAAAC 
AGAAAGAT T C T T T GAT T T GGGT AAT GAAGAGG T AAT T T GGGAC AG T GT GGT GGT AC 
CAGGAAGAAAGAGGATTGGAAAGGCCAGTACTGTTTTAGTTGCTCGGCACTGTTGG 
TTTTGTTTTAATGTGGTTGCCCTGTCCACTACATGGTTCTATCAGTAGTGTAATCC 
ATTTTCAATGTAAAGCTCTTTTAGTTTTTGTCATAGACATAAATTAATATTTTGAG 
AGGCATCCCTCACCTGTT CAT TTCTTCTGTGTTGAAATGAAGT AC TT AAAAT TACC 
GTTATACATGAACTTTGTGGACTGTAAGATTTGTTATATATGTTCAAATGCCTTTT 
AGCTGGCTTTTTAATTAATATGCCTGTTTTGAGTGCTTAATACAATGTAATGNGGA 
TTGTAAATCATACCTATTTTAAATCATTCCTTCCTGTATATTTGNACTCAGAGAGC 
CTTATTTTATTCTTCCAGC 

Sequence ID 4-»jr SEQ ID NO: 134 nt: 382

TTTTCTTAGAACTTTATTTTTTCTGGCCAGGCGCAGTGGCTCACACCTGTAATCCC 
AGCACTTTGGGAGGCCAAGGCAGGTCGATCACCTGAGGTCAGGAGCTCAAGACCAG 
CCTGGCCAACATGGTGAAACCCTGTCTCTACTAAAAATACAAAAATTAGCTGGGCG 
TGGTGGCGCATGCCTGTAATCCCANCTACTCAGGAGGCTGAGGCAGGAGAATTGTT 
TGAACCCGGGAGGCGGAGGTTGCANTGAGCCGAGATTGCGCCACTGCACTCCAGCC 
T GGGC AAC AG AGC G AAAC T C C AT C T C A A A A A A A A A A A A A A A A A AC A AC C T T T AT T T 
T T T C T GAT T T T AAAAGT AAT AAC TAG T T T GT AGAAAC AT TAAAAG T 

Sequence ID 192 SEQ ID NO: 135 

ACCCTAAACATAACTTAAAATTTGTTNGGAATTTGAAAGTACAGAATTTTCCTGTA 
ATTGAGACTNTTTAAACTTTTGTGGTTGGAGAAGGTATTCTATTTTTTGAAAATAT 
CTGTAAGTTTTATCTAAATAGTAAACTCTAAGTATTCTTCCCCTTTACTTACAGCC 
ACC C T GGGAAT C T GAGAC T AG AGAAAAT AAAG TTTGTCTCTTGTT C T AAGGAGGG T 
CTGGTTTAGAAATCTGATTTAGACATAGAAAAATTGCAAGAAGCTTGAGGTGATTG
GAAGATACGATTTTGTTATCAAAGNATGTTTCTGTTTTATAGATTTTATTCATCTA
CAACTCCTTATTAATATATTTAAGAAGTCATTAACCCACCATTGATTACTTGATAT 
AAAAGGAGAANCGGT GGT AAAAGGT GAAAT ANAAT TTT TAAT TTTTTTTTTTT TAA 
G T T T AGG AT t T T T T T T T AAAT T C T AAG AG T T T C T G T CAT T T GGGG AC AAT C AG AA 

Sequence ID 193 SEQ ID NO: 136 

T GGG AAT CAT AAT TNGT T AAC T GAAGC TNAT AAGAT GAGAGC AT T C ANAGAGAAAA 
GAACGGAAAGATTGAATATCAGTTTCCCTTCTTTAAAAAAATTGTGGATATGTGAT 
CTAGCTTCTTGAGCATCACAGTGACTGATTGGCTCGTGGTAATTGATCGCTATGCT 
GACAATCTTATCTCCACCTATGTCATTCAATTTTCTAAGAGGCAAAATCCTTAATC 
AGGAGGAGAGTTTAGCTCTAGCTAAATTTCCCTTGTCCAGCATGCTCCTGCTCCCC 
C AAC T T GT GGAAAC AGC T AAAGGAT T GG AC T AGGAGC ANAAG T T T GGAAT GG T TAA 
AAT G T AGC AAC AT GT GT T T C C T GAAAC AAAAT T C C AC T AT AAT AAAAAAAGC AT T T 
GAATGCTCCCTTGTAATTCTGTTGGAGCTTGTTGCCTTTTTTATGACACAACCATA 
ATCAGTGATAGACAGTAGCATAAAGAAGCAAGAGCAAAGCAATTAAGTAATAATAG 
CACTACAAAAATGTGTGCTGTACTTACCAAACACGACATTTATGAATTATTANATA 
GGAAT AAGGGGATGGT 

Sequence ID 494 SEQ ID NO: 137 

GACCCAGCCATCTAAATAAGTTRTACATGTTGCGTATTTTTTTGTTAGGGACTTAT 
CTTCCGAAGAGGAAAGGTTTATGAAACCTAAAGTAACAATGATAGCTTGGAATCAA 
AATGATAGCATTGTTGGCACAGCTGTGAATGATCATGTCCTCAAAGTGTGGAATTC 
TTACACTGGACAACTGCTTCATAACTTAATGGGACATGCTGATGAAGTATTTGTTC 
TGGAGACACATCCCTTTGATTCCAGAATTATGTTATCTGCAGGACATGATGGCAGC 
AT AT T T AT AT GGGAT AT T AC AAAAGGT ACC AAGAT GAAAC AT TAT T T TAAT AT GGT 
AAGT GAAG T GAGAT G T AC C T T GAT AC AT GC T T GAT AAT T T GT T T AGAGT AT T T GGG 
T TAT GCGGCTTACCCAGAAATTGATCTGCTTGTTTTGGCAGTTTGTTTTTAC AAAT 
CAACATATTCAAAGCCTGCTAAATATTAGACAGCTACATGTATATACGTACATACA 
TGAA 

Sequence ID 195 SEQ ID NO: 138

TTTC

Sequence ID 496 SEQ ID NO: 139 

CTCGCTGGCGGGAGGCCACGGGCTTTCCACAGCGCGGGGGAACGGGAGGCTGCAGG
ATGGTCAAGCTGACGGCGGAGCTGATCGAGCAGGCGGCGCAGTACACCAACGCGGT 
GCGCGACCGGGAGCTGGACCTCCGGGGGTGATCTGGACCCTCTGGCATCTCTCAAA 
TCGCTGACTTACCTAAGTATCCTAAGAAATCCGGTAACCAATAAGAAGCATTACAG 
ATTGTATGTGATTTATAAAGTTCCGCAAGTCATAGTACTGGATTTCCAGAAAGTGA 
A AC T AAAA T T T T AA T CCAGGTGCTGGTTTGC C AAC T G AC AAAAAG AAAG G T G G G C C 
ATCTCCAGGGGATGTAAAAGCAATCAAGAATGCCATAGCAAATGCTTNAACTCTGG 
CTGAAGTGGANAGGCTGAANGGGTTGCTGCAGTCTGGTC 

Sequence ID 497 SEQ ID NO: 140 

GAAGACCTCACATCTGAGAGCTCATCTGCGTTGGCATTCTGGAGAACGCCCTTTTG 
T T T G T AAC T GG AT GT AC T GT GG T AAAAG AT T T AC T CGAAGT GAT G AAT T AC AG AG G 
CACAGAAGAACACATACAGGTGAGAAGAAATTTGTTTGTCCAGAATGTTCAAAACG 
CTTTATGANAAGTGACCACCTTGCCAAACATATTAAAACACACCAGAATAAAAAAG 
GTATTCACTCTANCAGTACAGTGCTGGCATCTGTGGAAGCTGCGCGAGATGATACT 
TTGATTACTGCAGGAGGAACAACGCTTATCCTTGCAAATATTCAACAAGGTTCTGT 
TTCAGGGATAGGAACTGTTAATACTTCCGCCACCAGCAATCAAGATATCCTTACCA 
ACACTGAAATACCTTTACAGCTTGTCACAGTTTCTGGAAATGAGACAATGGGAGTA 
AAT AT T AC AC AAAT AC T T AT T CAT T GNGGT T AT T T T T AT AC AGT AGT GAGAAGAAT 
AT T G T T C C T AAG T T C T TAG AT AT C T T T T T T T GG AT G T GC AAAAAT T T T T GG AT T G A 
CAGTAACTTGGGTATACATGACACTGAAATGCCTTACTTTGGATGA 

Sequence ID 499 SEQ ID NO: 141 

TGCCTGCGGGCCAGGACCTCGCCCAGCCCATGTTCATCCAGTCAGCCAACCAGCCC 
TCCGANGGGCAGGCCCCCCAGGTGACCGGCGACTGAGGGCCTGAGCTGGCAAGGCC 
AAGGACACCCAACACAATTTTTGCCATACAGCCCCAGGCAATGGGCACAGCCTTCC 
TCCCCANAGGACCCGGCCGACCTCAGCGCCTCCTGCAGGCTAGGACACTGGTGCAC 
TACACCCCATGCCTGGGGGCCGAGATTCTCCAGCAGAAAGATGCAATATTTTTTGT 
TTCCTTTTTTTCCATTTTTTTCTCTAAGGAATCAATATTTCAATATGTTGAGTGTG 
T GT C C AAT GC T AT GAAAT T AAAAT AT T AAAT AAC AT AT T T AT GGC AT T T T C T T GAA 
GAGTGTGGTTGAAGAAATATTTCTCCTTTTGTTTTTCTTTTTTTTTTGNTTGNTAC 
TGCCACTTCTTTTTAGGAGCAAATCTCCCCAGGGGTGTACGGNATTTCTTGACTCT 
GGGAACAGCTGCTACCCCCAAGACTTGCCACGTTGTTCTGCCCTCAAATGGAATTA 
AGTG 



Sequence ID §-Q-Q- SEQ ID NO: 142 nt: 390

GGAATATGGTCAGGATCTTCTCCATACTGTCTTCAAGAATGGCAAGGTGACAAAAA
GCTATTCATTTGATGAAATAAGAAAAAATGCACAGCTGAATATTGAACTGGAAGCA 
GCACATCATTAGGCTTTATGACTGGGTGTGTGTTGTGTGTATGTAATACATAATGT 
TTATTGTACANATGTGTGGGGTTTGTGTTTTATGATACATTACAGCCAAATTATTT 
GTTGGTTNATGGACATACTGCCCTTTCATTTTTTTCTTTTCCAGTGTTTAGGTGAT 
CTCAAATTAAGAAATGCATTTAACCATGTAAAANATGANTGCTAAAGTCAGCTTTT 
TAGGGCCCTTTGCCAATAGGTANTCATTCAATCTGGTATTGATCTTTTCACAAA 

Sequence ID 502 SEQ ID NO: 143 

ACCCGCCATCTTCCAGTAATTCGCCAAAATGACGAACACAAAGGGAAAGAGGAGAG 
GCACCCGATATATGTTCTCTAGGCCTTTTANAAAACATGGAGTTGTTCCTTTGGCC 
AC AT AT AT GCGAAT C T AT AAGAAAGGT GAT AT T GT AGAC AT C AAGGGAAT GGGT AC 
T GT T C AAAAAGGAAT GC C CC AC AAG T GT T AC C AT GGC AAAAC T GG AAGAGT C T AC A 
ATGTTACCCAGCATGCTGTTGGCATTGTTGTAAACAAACAAGTTAAGGGCAAGATT 
CTTGCCAAGAGAATTAATGTGCGTATTGAGCACATTAAGCACTCTAAGAGCCGAGA 
TAGCTTCCT GAAAC G T G T G AAG GAAAAT GAT C AG AAAAAG AAAG AAG C C AAAG AG A 
AAGGTACCTGGGTTCAACTAAAGCGCCAGCCTGCTCCACCCAGAGAAGCACACTTT 
GTGAGAACCAATGGGAAGGAGCCTGAGCTGCTGGAACCTATTCCCTATGAATTCAT 
G G C A T A A TAGGTGTT A A A A A A A A A A A A T AAAG GACCTCTGGG 

Sequence ID §-Q-^ SEQ ID NO: 144 nt: 109

ACATTTTCCGGNCCTTTTGCCATACACAGTTACAGAGATCAGTCAAATCCATACCA 
CCACTGAGATCTCATTTATTGCCACAGATGCACAAAATAAATAACCCAAAATC 

Sequence ID 504 SEQ ID NO: 145 nt: 374

CCAGCAACGACCCATACCTCAGACCCGACGGCCCGGAGCGGAGCGCGCCCTGCCCT 
GGCGCAGCCAGAGCCGCCGGGTGCCCGCTGCAGTTTCTTGGGACATAGGAGCGCAA 
AGAAGCTACAGCCTGGACTTACCACCACTAAACTGCGAGAGAAGCTAAACGTGTTT 
ATTTTCCCTTAAATTATTTTTGTAATGGTAGCTTTTTCTACATCTTACTCCTGTTG 
ATGCAGCTAAGGTACATTTGTAAAAAGAAAAAAAACCAGACTTTTCANACAAACCC 
TTTGTATTGTANATAAGAGGAAAAGACTGAGCATGCTCACTTTTTTATATTAATTT 
T T AC AGT AT T T GT AAGAAT AAAG CAN CAT T T GAAATCG 



Sequence ID 505 SEQ ID NO: 146

GTACAGGAGGTAAATTGGATACCCCATCTAAGGGGATCTGTGAGACCAGGTAGTTA
TTTGGAATGAAAGAGTAAGATATTAAACCAGCCAGCATGTCAACAGGTGGGTGATA 
GTCTTGTTCT C AC AGAC AAC AGAT GGC C AT CAT C T T AAAAC AAC AT T T AT GT T AAC 
C AGC AGAT AAGGGAC T C C T GC AT T G T C AGT GGAC T T T G AGC C T GAGT T T T T C T AC T 
TGCATAGGTGAAAGTGGACTGCAATGCTAGTATAAATGCCGTATGATGACTAGTAC 
CCCTTAGGGAGCTCCAGTTTGCCTTCCTGGGGAACCACAGACCCCAAGTGTAATTT 
C C T GAG G AC AG C C C G AC T T C T 

Sequence ID 506 SEQ ID NO: 147 

GTTACTGTGAGCCTGTCAGTAGTGGGTACCAATCTTTTGTGACATATTGTCATGCT 
GAGGTGNGACACCTGCTGCACTCATCTGATGTAAAACCATCCCANAGCTGGCGAGA 
GGATGGAGCTGGGTGGAAACTGCTTTGCACTATCGTTTGCTTGGTGTTTGTTTTTA 
ACGCACAACTTGCTTGTACAGTAAACTGTCTTCTGTACTATTTAACTGTAAAATGG 
AAT T T T G AC T GAT T T G T T AC AAT AAT AT AAC T C T GAG AT G T G T G AAAAAAAAAAAA 
AAAAAAAAAAAAA 

Sequence ID 5-&^ SEQ ID NO: 148 nt: 521

CTGCGGTGGAGCCGCCACCAAAATGCAGATTTTCGTGAAAACCCTTACGGGGAAGA 
CCATCACCCTCGAGGTTGAACCCTCGGATACGATAGAAAATGTAAAGGCCAAGATC 
CAGGATAAGGAAGGAATTCCTCCTGATCAGCAGAGACTGATCTTTGCTGGCAAGCA 
GCTGGAAGATGGACGTACTTTGTCTGACTACAATATTCAAAAGGAGTCTACTCTTC 
ATCTTGTGTTGAGACTTCGTGGTGGTGCTAAGAAAAGGAAGAAGAAGTCTTACACC 
ACTCCCAAGAAGAATAAGCACAAGAGAAAGAAGGTTAAGCTGGCTGTCCTGAAATA 
TTATAAGGTGGATGAGAATGGCAAAATTAGTCGCCTTCGTCGAGAGTGCCCTTCTG 
ATGAATGTGGTGCTGGGGTGTTTATGGCAAGTCACTTTGACAGACATTATTGTGGC 
AAAT G T T G T C T G AC T T AC T G T T T C AAC AAAC C AG AAG AC AAG T AAC T G T AT GAG T T 
AAT AAAAG AC AT G AAC T 

Sequence ID 508 SEQ ID NO: 149 

AAGCTCATGATTTTAAATGTATTTTTCTAATAAACTATACTCCCATTTAAAAATCA 
CCAATACCTTAATGTTTCAATTATATAAGCTAATTAAAAATAAAGGCTGGGCGTGG 
TGGCTCACTTTGGAAGACCGAGGCAGGCAGATCACCTGAGGTCAGGAGTTCGAGAC 
CAGCCTGCCCAACATGGAGAAACCCCATCTCTACTAAAAATACAAAATTAGCCAGG 
CATGGTGGCACATGCCCGTAATCCCAGCTACTGGGGAAGCTGAGGCAGGAGAATCA
CTTGAACCTGGGAGGCAGGGGCTGCAGTGAGCCGAGATCATGCCATTGCACTCCAG
T C T GGGC AAC AAT AG T GG AAC T C C AT C T C AAAAAT AAT AAAAAAAAT AAAAT AAAA 
AT AAAAT T C AAACC T AAAAT AGAT GC T C T AC T T C AG GAG T GGGC AAAT T AAT C AC C 
TGCATCCTTTTTTTGGGCTTTC 

Sequence ID 5-9-£ SEQ ID NO: 150 nt: 575

TTTTTTTCTAAATGGNGATTACTAATATATGTGGAGACTATTAATCTCTTTTCTGT 
TGCCATTAGTTCATTTTTCCCCAAAAGCCAATACATGTTCATTACAAAAATGAATT 
ATAAAATATAAGTTAAAAGAAAAACATAAAACCCTACAATCTTACCCACCCAGACA 
ACTACTATTAATACCTTAGTATTAACATATACACATCATGTATATGTATAAATTTA 
T C T TAAAC AAAAAT AAAAT TAT T C T T T AC AT AT T G T T T T AAAAC C T AT T TAT C T GG 
CCAGGTGCCGTGGCTCACGCTTGTAATCCCAGCACTTTGGGAGGCTGAGGCACGTG 
GATCACCTGAGGTCAGGAATTCGAGACCAGCCCAGCCAACATGGTGAAACCCTGTC 
TCTAATGGTTTAAATACCAAAAAATTAGCTGGGCATGGTGGCACATGCCTGTAATA 
TCAGCTAACATGGGAGGCTGAGGCAGGAGAATCACTTGAACCANGGAGGGGGAGGT 
TGCAGTGAGCCGAAATCACACCACTTCACTGCAGCCTGGGCAACAAAGCAAGACTG 
T C T C A A A A AG A A A A A 

Sequence ID 510 SEQ ID NO: 151

C AC T GT C AT T C CC AGGAGGC T T T GGAGT C AGAAC T GGAT T C AAAT T C T GAC TNT AT 
GTTGTGTGACTTGGGCCAATAGCTTCTTTNTGTGCCTCAGTTTCTTTAGCTGTAAA 
TANACGGGTAGGTCACCCCTTACCCCATAGGTTATGGGGAAAGTTACAGAAAATGG 
TCAGCTGGGCNCAGTGGCTCAAGCCTGTGGTCCCAGCNCCTTGGGAGGCCAAGGTG 
AGC AGAT TGCTTGAGCCCAGGAGTTT GAC ACCAGTNTGGCAACGT GAC GAAACCCT 
ATCNCTGTGAAAAATACAAAAAATTAGCCAGGCATGGTGGTGTGTGTCTGTGGTTC 
CAGCTGCTTGAGAGTTTGAAGTGGGAGGATCACCTGAGCCCAGAAGGTCGAGGCTG 
CAGT GAGC TGT GAT CGCGTCACTGCACTCCAGCCTGGC- GAC AGAGT GAGA- CCCC 
T - T T T G A A A A A A A A A A A A A A A A A A T 

Sequence ID 512 SEQ ID NO: 152 

GTGAGCGGTGGTGGTTTATTCTTCCGTGGAGTTAAGGGCTCCGTGGACATCTCAGG 
TCTTCAGGGTCTTCCATCTGGAACTATATAAAGTTCAGAAAACATGTCTCGAAGAT 
ATGACTCCAGGACCACTATATTTTCTCCAGAAGGTCGCTTATACCAAGTTGAATAT 
GCCATGGAAGCTATTGGACATGCAGGCACCTGTTTGGGAATTTTAGCAAATGATGG 
TGTTTTGCTTGCAGCAGAGAGACNCAACATCCACAAGCTTCTTGATGAAGTCTTTT
TTTCTGAAAAAATTTATAAACTCAATGAGGACATGGCTTGCAGTGTGGCAGGCATA
ACTTCTGATGCTAATGTTCTGACTAATGAACTAAGGCTCATTGCTCAAAGGTATTT 
ATTACAGTATCAGGAGCCAATACCTTGTGAGCAGTTGGTTACAGCGCTGTGTGATA 
T C AAAC AAGC T T AT AC AC AAT T T GGAGG AAAACGT CC C TTTGGTGTTT CAT T GC T G 
TACATTGGCTGGGATAAGCACTATGGCTTTCAGCTCTATCAGAGTGACCCTAGTGG 
AAATTCGGGGGATGGGAAGGCCACATGCATTGGAAATAATANCGCTGCAGCTGTGT 
C AAT G T T G AAAC AAG 

Sequence ID 513 SEQ ID NO: 153 

TTTTTTTTTTATAAACTCCAATCATTTCCAGAGCTACTTAGCTCAGCATCTTTTTT 
TTCCACGCTCTTAAGTTGTGTTTATACATTTTTGATACAGTTAGATTGTTTTTGTC 
ACATTCTTCATTCTATCCTGGGATCCCCCAACCACCTAAGTGGATTTTTTGATAAT 
TTGCATGCTTTAAGGATAACTCTTCATTCTGNAAAGGGCTATGGGTTTTGGCAAAT 
GCAGAGTCATGTATCCAAGATTACAATATCGCACAGAAGAGTTTCATCACTATATA 
AAACTCACCAGTCTTCCTCCTATTCAACCATCTCCATGCCTTCTTCCCAGCCCTAA 
CTCCTTAAAACCACTCATATCTTTACTATTGCTATAGTATTGCCTCTTCCACCATG 
TCATATAAATGGAAACATACAGTATTAGTCTTCTCAAACTAGTTTCTTTTACCTAA 
CAACATGCATTTAAGATTCATAGTGTCTTTTAATGACTTGATAGATTATTTCTTTG 
TAGCTGAATAATATTGCATCTTATAGATGTAACCGTTTGTATATCCATATTTTCTC 
AC AGC C T AT G AC T T GNC T T T T GAT T C T C T G AAC AGGC C AT T C AC AAAGC AG AAG T T 
T T AAT T T T TAT AAAGC T AAT GNAT C AAC T T 



Sequence ID 515 SEQ ID NO: 154 

CCTGGATGACAGCATATCTGTTTATAGCTCAGTTTACTGAATACTTTAAGCCCACT 
GTTGAAACCTGCT 

Sequence ID 518 SEQ ID NO: 155 nt: 502

GATGCATGTCCAGCATAGGCAGGATTGCTCGGTGGTGAGAAGGTTAGGTCCGGCTC 
AG AC T G A A T A AG A AG AG A T A A A A TTTGCCTT A A A AC TTACCTGGCAGTGGCTTTGC 
TGCACGGTCTGAAACCACCTGTTCCCACCCTCTTGACCGAAATTTCCTTGTGACAC 
AGAGAAGGGCAAAGGTCTGAGCCCAGAGTTGACGGAGGGAGTATTTCAGGGTTCAC 
TTCAGGGGCTCCCAAAGCGACAAGATCGTTAGGGAGAGAGGCCCAGGGTGGGGACT 
GGGAAT T T AAGGAGAGC T GGGAACGGAT CCC T T AGGT T C AGGAAGC T T C T GT GC AA 
GCTGCGAGGATGGCTTGGGCCGAAGGGTTGCTCTGCCCGCCGCGCTAGCTGTGAGC
TGAGCAAAGCCCTGGGCTCACAGCACCCCAAAAGCCTGTGGCTTCAGTCCTGCGTC
TGCACCACACATTCAAAAGGATCGTTTTGTTTTGTTTTTAAAGAAAGGTGANAT 

Sequence ID 519 SEQ ID NO: 156 

C T GC GATNGAG T T T T GAG AGGAAGGANT AAAG TNC T CAT C T CNGACGGT GAGAAAG 
ATCATNACTAAGGAAACGCAGGGTTGGAAGCAGTGCTGANTGTCCAGTTGAGTTTC 
ATGANCAAACATTTGCTGTGGGACCAGTTTTCATGGNGGTTTGTCATTTTGTCCAG 
CTGCCTGGAGCTGCTTGGTTGAAGGCACAGAATAATCAGGATTAATTGTTNAACTT 
GTATGAATTTCTTTATTTTAAAATAGGAATAATATCTGCCTTGGGAGCAAGTTGTA 
AGAGTTAACTGAAAGCTTNAGGAAAAACTTTCCCTTGCTATTTAAGTAGGGCTTTA 
CAAGTTACAAT TCTAT C AC AGT T T T AAGAT T AT AAAC 

Sequence ID 521 SEQ ID NO: 157 

GCGGCGCANCTGCGGATCCANAAGGNCATAAACGANCNGAACCTGCCCAANNCGTG 
TGATATCACCTTCTNAGATCCAGACNACCTCCTCAACTTCAAGCTGGTCATCTGTC 
C T GAT GAG G G C T T CN AC A AG AG T G G G AAG T T T G T C T C AAAAAA 

Sequence ID ^^- SEQ ID NO: 158 nt: 585

GATTTACTGTGGGAATTTGCTCATGCAATTATGGAAACCTAGAAGTCCCATAATAT 
GCCATCTT C AAGC T GGAAT C C C AGGAAAGC AGGT GGT GT AAT T C T GAGAT T G AAG T 
CTTGAGAACCGGGGGAGTCAATGGTGTAACTCCCAATCTAGGGCTTAAGGCCCAAG 
GACCAGGGCTGCTGGTGTGCAGATGCAAATCCTGGAGTTCAAAGGATTGAGAACCA 
GGAGCTCTGGTGTCTGAGGGCAGTAGAAGATGGATGTTCCAGCTCAAGAAGGGAAA 
GTAAGAATCCGTCCTTCCTCCACTTTTTTGTTCTATTCAGATGAGCCCTCAATGGA 
CTGAACGATGCTCACCCACACTGTGAGGGCTGGTCTTCTTTATTCAATCCACTGAC 
T T AAGT GC T GAT C T C T T C T GG AAAC AC C T T C AC AG AC AC AC C C AG AAAT AAT GT T C 
TACCAGCCATGGGCCTGTTACTTAGCCCAGTCAAGTTGACACAGAAAATTAGCTAT 
CACAACATCTGTGTGTGTATATACATATGTATTTGCATGTGTGTGTATATATGGNG 
TATATATATTCATGTGTGTGTATAT 

Sequence ID 521 SEQ ID NO: 159 

CTTTTGCCAGTAGGCCCCCTGAGTAGGTTCCTCTATCTTTTGGCATGACCCCAGAA 
GTCTTTGATAACTTCCTTGCTTTCTGATGTGACAAGACATCCAGGGCCAGATTGTC 
CATATCCTGCCCCGGATGCACGATGCACTGTTTCTCCAAGAATCCCTGTGTCCTTT 
GCTGATGATGCCATGATTTTAAGTTCTCTAATATAGTTTTATCTCTTTGTTTCAGA
TAATGCTTTTGTGTTCTCACATGTCCTGCTCTCTCTCTCTCTCTCATTTTGGTGTT
GATCAGTCTTTCCATAAGATTGTTTATTTCACTAGTCCTTCATTCTTCTTTTTTCT 
AAATTTACTCTTCTTGACTAGTATCCTGTCACTTCTGAGGACTCATATTTTTGCAA 
CTTGAAAATTATTCTTATTTATTTAAGTATATGTTNCTGAAACTCTCATTAGACAC 
ATTTTG 

Sequence ID 525 SEQ ID NO: 160 

G T T A A A A A A AG T A A A AG G A AC T C G G C AAA TCTTACCCCGCCTGTTTACC A AAA AC A 
TCACCTGGTAGCATCACCAGTATTAGAGGCACCGCCTGCCCAGTGACACATGTTTA 
ACGGCCGCGGTACCCTAACCGTGCAAAGGTAGCATAATCACTTGTTCCTTAAATAG 
GGACCTGTATGAATGGCTCCACNAGGGTTCANCTGTCTCTTACTTTTAACCAGTGA 
A A TTGACCTGCCCGT G A A GAG GCGGGCAT A AC AC AG C T G A A A A A A A A A A A A A A A A A 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAAAAATTTT 

Sequence ID ^r& SEQ ID NO: 161 nt: 516

CTTTTCATGGTCTCTTGTTCATTAATCATCTAAAATCCAAGCNCAGAGAATTCAAT 
T T T AG AT GG T C T C C AG AGC AG AAT T T GAT G T AT AAT C T T AAT T AC AAAT CAT AG AT 
AAT T AAT AT T GNT T AC AAAAT C ANAAT ACGAT T AGAGGT AGGGAT CC T GC AC AC AC 
CCTATTTTCCTCCCCAGTGTTCTGACCGAGAGACTAATTAATAATTCAAGGAACTT 
ACAGTGAATGANAACCCATGGTTTTGCTTAATTATCAGAACAGCTAGATCTGAGAA 
CAGCTGTCTCCCACATGGATAGACACTTATTCCACCCATTTGCAGGTAGAATAGCT 
GGCAATAATAAGTCCTTCCCATTGGATATGTTGAAAGGTGCCTGCCATGGCATAGT 
TGCCACAAGAGAGGAAGAAATGGACACAAATGTAGGCTGTTTTCAGGGCANAGGGA
AGGTGGGAGGAAACCAANTTGCTGGTTTTCACACACCCTCTGGGGAACACCCATGC 
ACCTATGANATG 

Sequence ID 527 SEQ ID NO: 162 

GACAAAAGCTGAGAGAATTTTTTTCTTGAATATTTGCACTAAAAGATAGGTTAAAA 
TTCTTCAGGCTGAAGAGAGCATACCAGGTGGAGATTTGGATCTACAAAAAGGAAGG 
AAGATTTGGAAATGGATTTGGCACCATTGACTCAATTTCCAGAACAAGAAAGCAGG
GACAGTTTTGGGAAGCTCAAGACACACTGCCCATGAGCAGCAATTTGGACCTCCTG 
CTGCATCCACTGTGCATCAAACACACACTGTACAGACAAAGACTCCCAGGAAAAGA
AGTATAAACATGGACTAACACAGAGATGGGCAAACTACAGCCTGTGACCCAGCCAC
CTGTTTATGTAGAATCCAAAGTAAGAATCTTTAACTTACACATAAACTT
Sequence ID 529 SEQ ID NO: 163 nt: 660

GACAGCAGAGCACACAAGCTTNTAGGACAAGAGCCAGGAAGAAACCACCGGAAGGA 
ACCATCTCACTGTGTGTAAACATGACTTCCAAGCTGGCCGTGGCTCTCTTGGCAGC 
CTTCCTGATTTCTGCAGCTCTGTGTGAAGGTGCAGTTTTGCCAAGGAGTGCTAAAG 
AACTTAGATGTCAGTGCATAAAGACATACTCCAAACCTTTCCACCCCAAATTTATC 
AAAGAACTGAGAGTGATTGAGAGTGGACCACACTGCGCCAACACAGAAATTATTGT 
AAAGCTTTCTGATGGAAGANAGCTCTGTCTGGACCCCAAGGAAAACTGGGTGCANA 
GGGTTGTGGANAAGTTTTTGAAGAGGGCTGAGAATTCATAAAAAAATTCATTCTCT 
GTGGTATCCAAGAATCAGTGAAGATGCCAGTGAAACTTCAAGCAAATCTACTTCAA 
CACTTCATGTATTGTGTGGGTCTGTTGTAGGGTTGCCAGATGCAATACAAGATTCC 
TGGTTAAATTTGAATTTCAGTAAACAATGAATAGTTTTTCATTGTACCATGAAATA
TCCAGAACATACTTATATGTAAAGTATTATTTATTTGAATCTACAAAAAACAACAA
ATAATTTTTAGATATAAGGATTTTCCTGGATATTGCACGGGAGA 

Sequence ID 529 SEQ ID NO: 164 

GACAGCAGAGCACACAAGCTTNTAGGACAAGAGCCAGGAAGAAACCACCGGAAGGA 
ACCATCTCACTGTGTGTAAACATGACTTCCAAGCTGGCCGTGGCTCTCTTGGCAGC 
CTTCCTGATTTCTGCAGCTCTGTGTGAAGGTGCAGTTTTGCCAAGGAGTGCTAAAG
AACTTAGATGTCAGTGCATAAAGACATACTCCAAACCTTTCCACCCCAAATTTATC
AAAGAACTGAGAGTGATTGAGAGTGGACCACACTGCGCCAACACAGAAATTATTGT
AAAGCTTTCTGATGGAAGANAGCTCTGTCTGGACCCCAAGGAAAACTGGGTGCANA 
GGGTTGTGGANAAGTTTTTGAAGAGGGCTGAGAATTCATAAAAAAATTCATTCTCT 
GTGGTATCCAAGAATCAGTGAAGATGCCAGTGAAACTTCAAGCAAATCTACTTCAA 
CACTTCATGTATTGTGTGGGTCTGTTGTAGGGTTGCCAGATGCAATACAAGATTCC 
TGGTTAAATTTGAATTTCAGTAAACAATGAATAGTTTTTCATTGT 

Sequence ID 530 SEQ ID NO: 165 nt: 660

GACAGCAGAGCACACAAGCTTNTAGGACAAGAGCCAGGAAGAAACCACCGGAAGGA 
ACCATCTCACTGTGTGTAAACATGACTTCCAAGCTGGCCGTGGCTCTCTTGGCAGC 
CTTCCTGATTTCTGCAGCTCTGTGTGAAGGTGCAGTTTTGCCAAGGAGTGCTAAAG 
AACTTAGATGTCAGTGCATAAAGACATACTCCAAACCTTTCCACCCCAAATTTATC 
AAAGAACTGAGAGTGATTGAGAGTGGACCACACTGCGCCAACACAGAAATTATTGT 
AAAGCTTTCTGATGGAAGANAGCTCTGTCTGGACCCCAAGGAAAACTGGGTGCANA
GGGTTGTGGANAAGTTTTTGAAGAGGGCTGAGAATTCATAAAAAAATTCATTCTCT 
GTGGTATCCAAGAATCAGTGAAGATGCCAGTGAAACTTCAAGCAAATCTACTTCAA
CACTTCATGTATTGTGTGGGTCTGTTGTAGGGTTGCCAGATGCAATACAAGATTCC 
TGGTTAAATTTGAATTTCAGTAAACAATGAATAGTTTTTCATTGTACCATGAAATA
TCCAGAACATACTTATATGTAAAGTATTATTTATTTGAATCTACAAAAAACAACAA
ATAATTTTTAGATATAAGGATTTTCCTGGATATTGCACGGGAGA

Sequence ID 532 SEQ ID NO: 166 

GAATTGTGATAGTTCAGCTTGAATGTCTCTTAGAGGGTGGGCTTTTGTTGATGAGG 
GAGGGGAAACTTTTTTTTTTTCTATAGACTTTTTTCANATAACATCTTCTGAGTCA 
TAACCAGCCTGGCAGTATGATGGCCTANATGCAGAGAAAACAGCTCCTTGGTGAAT 
TGATAAGTAAAGGCAGAAAAGATTATATGTCATACCTCCATTGGGGAATAAGCATA 
ACCCTGAGATTCTTACTACTGATGAGAACATTATCTGCATATGCCAAAAAATTTTA
AGCAAATGAAAGCTACCAATTTAAAGTTACGGAATCTACCATTTTAAAGTTAATTG
CTTGTCAAGCTATAACCACAAAAATAATGAATTGATGAGAAATACAATGAAGAGGC
AATGTCCATCTCAAAATACTGCTTTTACAAAAGCAGAATAAAAGCGAAAAGAAATG 
AAAATGTTACACTACATTAATCCTGGAATAAAAGAAGCCGAAATAAATGAGAGATG 
AGTTGGGATCAAGTGGGATTGANGANGCTGTGCTGTGT 

Sequence ID 533 SEQ ID NO: 167 

CTTGAACCTCGGAGGCAGAGGTTGCAGTGAGCCGAGATCACGCCACTGCACTCCAG 
CCTCGGGGACAGAGCAAGACTCCATCTCAAAACACACACACACACACACACACACA 
CACACACACACACAAAACAGATATACACTGAACACAGCACAAGTGGGACATAAGAG 
ATTTAAAAGGGTTAGAGATGTAAAATGGATCTAGGAATGGAAACCATAAGGNGGGA 
TTTATCAACTGGATTCTGCANAATGCTGTTAAGGCCAGATGTTAGCAGGTGTTACA 
TAAAAAAGGGATACCATGAGCAAAAGTATTTGAACATGGGCAATGGTTGAAACAAG 
TTTAAACAGATTATNTTTATTACCAAATCTCTCAAACCTTTAATATGCTATAAACA 
TTGTGAAACAATAAAAAAACTTTCCAAAA

Sequence ID 534 SEQ ID NO: 168 

GGGAAGGGAGCTATGAGTGTGTGTGTTGTGTATGGACTCACTCCCAGGTTCACCTG 
GCCACAGGTGCACCCTTCCCACACCCTTTACATTCCCCAGAGCCAAGGGAGTTTAA 
GTTTGCAGTTACAGGCCAGTTCTCCAGCTCTCCATCTTANAGAGACAGGTCACCTT 
GCAGGCCTGCTTGCAGGAAATGAATCCAGCAGCCAACTCGAATCCCCCTAGGGCTC
AGGCACTGAGGGCCTGGGGACAGTGGAGCATATGGGTGGGAGACAGATGGAGGGTA 
CCCTATTTACAACTGAGTCAGCCAAGCCACTGATGGGAATATACAGATTTAGGTGC 
TAAACCGTTTATTTTCCACGGATGAGTCACAATCTGAAGAATCAAACTTCCATCCT 
GAAAATCTATATGTTTCAAAACCACTTGCCATCCTGTTAGATTGCCAGTTCCTGGG
ACCAGGCCTCANACTGTGAAAGTA

Sequence ID 560 SEQ ID NO: 169 

GGCGGAGGTTGCAGTGAGCTGAGATGGCGCCATTGCTCTCCCAGCCTGGGTGACAA 
GAGCAAAACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAGCAATTTACTTAAAAAC
ATACAAACACAGAGACAAGTATTTTTGAGAAACAAATACCTTTTTCATTTTTTATA
CCAATGTAACAATAATCCATTAAACACACCTTTACTAACTGTTTTCTAGGAGTCTG 
ATATGATGAGGAAATAGGTAAACCTTTAATAGCCAGTACTAAATTAGAGTGGCACA 
ACTTTCACTGGGAAAAAAGATGGGTATTTTACTTTTCTGTTTTAGAAAAGTGGCTT 
GACAACAGTATGCTTATGTCTTAGAGTTTGAAATTCAAGTTCTTGAACATTATTAA
TGGCTACAATCATTCATACCCACATTGGGCTGTATTCTTGATGAATCCAAAGTGAT 
TTTCACCTCAACTCTGAATTTCATTCTCCTCTTTTGAATATAATACAACCATCTCA
CTAGAGGAAGCATTTCAGTCTTTTCTGATTGGAGATTCATTATTGTTTTAGATAAT 
GTTTTCATTTGCTTATGGGTATATAAAAAATTTTATCTTAAAAATATTTCCTCTCA 
TTTAGCTAGCAACATTGTTTTC 

Sequence ID 561 SEQ ID NO: 170 

CTCAGGGTGATCTCTGAACCCAAACTTGCCCCAAAGAAGGTTGCTCTGTCCTCTCC
ACATCCCCATCTCCTCCCTAGGGCCTTGTTGGGGAGAGGCTCCTCCATCTTTCCCA 
AGTCACACCATCGTTTCCTACGTGGTCTGGACAAGAGCAAGAGCACACCTTGTCCC
CACCTTCTCCAGAGCAGCCAGAACCCACCTCAGGTGCCTTCCCCATCCGGTGCAGT 
TAAGGCACTTCTGCCAGCACCATGGTATGAGCACTAGACTTGGAGTTAAGATTTGA 
GAGCCCCCTCTGTCACTGTGGAAGCTTGAGCATGTTGCTTGATCTCTCTGAACCTT 
GTGTTTCTCATCTGTGAAAGGTGATAATGTGGGGCTGCTGTGAGATTTAAAGGACA 
TAATGCACCTACGGTCCAAGCACTGCCTGGAATACAGCANAAGCTCAACAGATACT 
GGACAACCCATCCCCTTAGTAGAGGCACTAACCATGTGACCCAAGGCAAAAGTGCT 
TAAAAAAA 

Sequence ID [illegible] SEQ ID NO: 171 nt: 580

ATTGCATGCAAGTTTGCTGAGCTGAAGGAAAAGATTGATCGCCGTTCTGGTAAAAA 
GCTGGAAGATGGCCCTAAATTCTTGAAGTCTGGTGATGCTGCCATTGTTGATATGG 
TTCCTGGCAAGCCCATGTGTGTTGAGAGCTTCTCAGACTATCCACCTTTGGGTCGC 
TTTGCTGTTCGTGATATGAGACAGACAGTTGCGGTGGGTGTCATCAAAGCACTGGA 
CAAGAAGGCTGCTGGAGCTGGCAAGGTCACCAAGTCTGCCCAGAAAGCTCAGAAGG
CTAAATGAATATTATCCCTAATACCTGCCACCCCACTCTTAATCAGTGGTGGAAGA
ACGGTCTCAGAACTGTTTGTTTCAATTGGCCATTTAAGTTTAGTAGTAAAAGACTG 
GTTAATGATAACAATGCATCGTAAAACCTTCAGAAGGAAAGGAGAATGTTTTGTGG 
ACCACTTTGGTTTTCTTTTTTGCGTGTGGCAGTTTTAAGTTATTAGTTTTTAAAAT 
CAGTACTTTTTAATGGAAACAACTTGACCAAAAATTTGTCACAGAATTTTGAGACC
CATTAAAAAAGTTAAATGAG

Sequence ID 563 SEQ ID NO: 172 

GCAACCTGCACAACCCCGCCCTGTTCGAGGGCCGGAGCCCTGCCGTGTGGGAGCTG 
GCCGAGGAGTATCTGGACATCGTGCGGGAGCACCCCTGCCCCCTGTCCTACGTCCG 
GGCCCACCTCTTCAAGCTGTGGCACCACACGCTGCAGGTGCACCAGGAGCTGCGAG 
AGGAGCTGGCCAAGGTGAANACCCTGGAGGGCATCGCTGCTGTGAGCCAGGAGCTG 
AAGCTGCGGTGTCAGGAGGAGATATCCAGGCAGGAGGGAGCGAAGCCCACCGGCGA 
CTTGCCCTTCCACTGGATCTGCCAGCCCTACATCCGGCCGGGGCCCAGGGAGGGGA 
GCAAGGAGAAGGCAGGTGCGCGCAGCAAGCGGGCCCTGGAGGAAGAGGAGGGTGGC 
ACGGAGGTCCTGTCCAAGAACAAGCAAAAGAAGCAGCTGAGGAACCCCCACAAGAC 
CTTCGACCCCTCTCTGAACCAAAATATGCAAAGTGTGACCAGTGTGGAAACCCAAA 
GGGCAACAGATGTGTGTTCAGCCTGTGCCGCGGNTTG 

Sequence ID [illegible] SEQ ID NO: 173 nt: 671

GGAATAGAATTTTAAATAGTAATAACTGCTTGTTTTTTTTGTGCAAGTACTTTTAT 
ACATAAGATAAACAAAAACCTTACCACCAAACATACCAAAATGCACCTCTTTCATA 
AGTGAGTTACTAAGATTTCTATACCTGGAATATCATGTATGTTTCATTTACTGGAT 
GTTTACATTTTAGGAAGGAAAATAGTTTTGTTTATTTAAACAACTGAATACTTATA
AACTGTTGTTCCTGGAAGTTATTTATTCCATAAAAAATTTGTTCTTTTGTCATGAA 
TTTATAATTCCTAAATGAAGACCAGAAAGTACAAATTGCTGGGAGGAAGAATAGGC
TTTATTAATCAACTGATGTCTTGATTTTTCTAAATGGGAAGATTGCTTTATTTTTA
ACACTAATTATGGGAGCAGATTCTTAGCAAACTTCTTTGGAAAAGTTAATGTTATG
ATGTGCATTAGGCTGCCCCATCGTGTATATAAATGAAGCAGATTTGATTTTTGTAT 
TCTTACGTTTCTCTGCTTTGTAGTTGTGGCTGTACTTAAAGAAATACAGAATTTCA 
TATATTTAAAAATGTTTAAAATGTGACCCACAGACATTGTAAATGGATTNAAAACT 
AACATGAAAAATATTCAACCTAAAAGAATTCTTAACTTCACAAGTGTTTTACTTC



Sequence ID 565 SEQ ID NO: 174 

CTTGGTTCCGCGTTCCCTGCACAAAATGCCCGGCGAAGCCACAGAAACCGTCCCTG 
CTACAGAGCAGGAGTTGCCGCAGCCCCAGGCTGAGACAGGGTCTGGAACAGAATCT
GACAGTGATGAATCAGTACCAGAGCTTGAAGAACAGGATTCCACCCAGGCAACCAC 
ACAACAAGCCCAGCTGGCGGCAGCAGCTGAAATCGATGAAGAACCAGTCAGTAAAG
CAAAACAGAGTCGGAGTGAAAAGAAGGCACGGAAGGCTATGTCCAAACTGGGTCTT 
CGGCAGGTTACAGGAGTTACTAGAGTCACTATCCGGAAATCTAAGAATATCCTCTT 
TGTCATCACAAAACCAGATGTCTACAAGAGCCCTGCTTCAGATACTTACATAGTTT 
TTGGGGAAGCCAAGATCGAAGATTTATCCCAGCAAGCACAACTAGCAGCTGCTGAG 
AAATTCAAAGTTCAAGGTGAAGCTGTCTCAAACATTCAAGAAAACACACAGACTCC
AACTGTACAAGAGGAGAGTGAAGAGGAAGAGGTCGATGAAACAGGTGTAGAAGTTA 
AGGACATAGAATTTGGTCATTGTCACAAAGCAAATGTGTCGAGAGCA 

Sequence ID 566 SEQ ID NO: 175 

GTCACCAAGAGCTTGTTGTCAGGTTTTCACTTGCTATTCGCAGAGATtTTTTTTAA
AGGCACTATTTGTAGTGTTAAAAGGGTGAATTTATCANAAGGCATAATAATCATAA 
ATGTGTATATGCCTAATAATAGAACTTTAAAAGGCATGAAGCAACACTCAAAAGGA 
TTAAAGGGAGATCATCTCACCCCCTTCTTACCAATTGATAGAATGATCTGATGAAA 
ACAGTAAAATAACAACAGATCTGAACACTGTCAACCATCTTGACAAATACTTATGC 
CTAGTGTTCCATTATTGGAACACTAAACATGTGGAATGATTTATATCCTACTGCTC 
AAGGTCATCACCAAGGTCTAATTGTAAAATTTCAAAAAATTGCAACCTCAGGCATA
AATGGGTTAATCGACATTTATAGCACACACATGCAACATGTACCAGAGATTCCTTC 
TTTTCTATGAACATGGTACTTCCACCAAGATAGACCACATTGTGAACTATAAAACA
AATCTAAAAACATTTGAAATGAAGGAAATTATATAAAATATGTTCTCTTGATCTCA 
ATGAAATTAAATTAATACTATAT

Sequence ID 567 SEQ ID NO: 176 

CTCATGGCGGCCAATGTAGGCCCAAAACTTCCTCAAGTCAAACTCTCCAGGCCCAC 
CTTCTGCTTCCCGGTGGCATCAACAGGCCCAGCTTTGACTTGAGAACAGCCTCTGC 
AGGCCCTGCTCTTGCCTCCCAGGGGCTTTTTCCAGGCCCAGCTCTTGCCTCATGGC 
AGCTGCCCCAGGCCAAATTTCTGCCTGCCTGCCAGCAGCCTCAACAGGCACAGCTC
CTCCCTCACAGTGGCCCATTTAGGCCCAACTCATGACTGTGAGGCCATTTCCAGGC 
CTAGTGCCTGCCTCGTGGCTGACTCTTGAAGCCCAAAACTTCCTCAAATCAGCCTT 
TTGCCCAACTTCTGTCTACTGTCGGACTCTACAGGTCAGCCTCTGCCTCACAGTGG 
ACCCTCCAGACCCAGATGGTGTCTNCTGTGGCATCCTCAGGCGAAGCTCCTGCCTT 
TCGGCAGCCTCTCCAGGCCCAGCTCCTCCTGCTCCAGCCTTCTCTCCAGGCTCTGA 
ACTTTCTCAGGTCTCCCTCTGTTGTCCAAGGCTGGAGTGTAGTAG 
Sequence ID 568 SEQ ID NO: 177

TATATATGTAATGCCCTTAACCTAGTGTTTGGCATGATCGTTGCTGAAAGGGAAGC 
TTGTGGGTACAGTGTCCCCTCAGAAGCCAAAGCCCAGGGAAGGTCGCCTGCCCAGG 
TCAGGCTCCCAGCGAGTTTGTCTGGGGAGGGGCCATTCATACCTCCAGGTCAGGAC 
AGAGGCTCGGGCTGAGGGAACCCTACACAGGTCCTGGAAGCAGATCCTTCCTGCCT 
AAGCCAGCAGGACAGCTCAACAGGAAGCATCTTCCAGCCACGGGAGGAGAGGCAGC 
ACCTTTTTTGGAACCATACAGAGCTAAGAATGGTGGTACAAGTAATAGATTCTGTA 
CTGGCAACCCCACTTGGTGGAGCAAGTTCTAGGAAAAGGGGGCTGTCCTTGAGTCA 
GCCATGGGGTCAGCCACACAGTCACCGCAGCTGCTCTTTGGCACCGGGCGCTGGAA 
AGACCTAGGATGACACAGCCTGGAAAGAGCTTGGGAAAAGCTCATCTTCCACAGAA 
CTACCTGCTATACCAGCCAGGGCAGGTGCTTATTCCCACAACAGCCCTCTGTTGTA 
GGCGGCAGTGCCATCCTGAANGTGCCGTGGTACCTTCTGAANACCCAGCTGAGGGC 
CTGTAATGGCACTTGCATGCCACATGGNACACCCTTTCCCGGTTAA 

Sequence ID 570 SEQ ID NO: 178 

ACCGCGGCCGCGTNAANAAAAAAAAAAAAAGAATTCCACTTGATCAACTTAATTCC
TTNTCTTTATCTTCCCTCCCTCACTTCCCTTTTCTCCCACCCTCTTTTCCAAGCTG 
TTTCGCTTTGCAATATATTACTGGTAATGAGTTGCAGGATAATGCAGTCATAACTT 
GTTTTCTCCTAAGTATTTGAGTTCAAAACTCCTGTATCTAAAGAAATACGGTTGGG 
GTCATTAATAAAGAAAATCTTTCTATCTTAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAA 

Sequence ID [illegible] SEQ ID NO: 179 nt: 457

TTAGAGAGGTGAGGATCTGGTATTTCCTGGACTAAATTCCCCTTGGGGAAGACGAA 
GGGATGCTGCAGTTCCAAAAGAGAAGGACTCTTCCAGAGTCATCTACCTGAGTCCC
AAAGCTCCCTGTCCTGAAAGCCACAGACAATATGGTCCCAAATGACTGACTGCACC 
TTCTGTGCCTCAGCCGTTCTTGACATCAAGAATCTTCTGTTCCACATCCACACAGC 
CAATACAATTAGTCAAACCACTGTTATTAACAGATGTAGCAACATGAGAAACGCTT 
ATGTTACAGGTTACATGAGAGCAATCATGTAAGTCTATATGACTTCAGAAATGTTA 
AAATAGACTAACCTCTAACAACAAATTAAAAGTGATTGTTTCAAGGTGATGCAATT 
ATTGATGACCTATTTTATTTTTCTATAATGATCATATATTACCTTTGTAATAAAAC 
ATTTTTCCC 



Sequence ID 572 SEQ ID NO: 180 
CGTCTATTTGNGTTTCTTCTCACAATTGGTAAGTTCTCTGTATTGATTGATGGCTA
AGTTTGATTAGTGTTTTTCTCTAGTTGGTAATTATATTCTAGTATTTTATCATCTT 
ATTGTTTACTCAACTNAAAGTGNCACAGAAGAGTTGCCAGGTTTCTCTTTGATATG 
AGATCTCTNNTTGATTTGGAATGCAAATCANAAGTGTCATGTTTTGAATAAAGGGA
CCAGATGACTTATAGGTATTCTTTCTCTAAATATAACTAAGGTAAGATTTTTGTTT 
TGAGGTACTTAATCTATATAAGTGGTAAAGAATTTACTTGAATTTCTCCAAATTCT 
CATGTCTAAAGTCTGATTGATTAAATTCATTCTTGGTATTTCATTTTGAAAAGAAT 
GTAGCTTTAGCAAACCTCTTTGTATAAATGCAGTGGGATTAAGGTCATTTAAAAAA 
TTGTTATATCATTGTATTTTTAAAATTTACCAGTTTTATTTTTCTTTTTACCCTTT 
AGCCCGGCCTCAGAAAGTGTGTTTGTGTCCATTTCTCCCAGCGCACCCTCTGCATA 
TCTCTACCCACTTGTCATAATTCAGCATCCAGCAGAGGAAAACAAAGTGTTGCGTA 
CAGTTCCTCTACTAGCAGCATGCCTCCCCCAGGACAAGTGTA 

Sequence ID 574 SEQ ID NO: 181 

TTATTGCTGACATAAAAATGGTGCACATCGGCCAGGGCCCAGGATGAATCAGCCAA 
TCTGCACCATTTATACATGGAACTGGAGAACATTGTGCCAATAATCATTTAATATA 
TGCCAAATCTTACACGTCTACTCTAAACTGCTCTAATGAAGTTTCAGTGACCTTGA 
GGGCTAAAGATTGTTCTTCTGGGTAAGAGCTCTTGGGCTGGTTTTTCANAGCAGAG 
TTCTTGTTGTGGGTAGACTGTGACTAGGTTCACAGCCTTTGTGGAACATTCCGTAT 
AACGGCATTGTGGAAGCAATAACTAGTTCCTATGAAAGAACCAGAGCTGGGAAGAT
GGCTGGGAAGCCAGGCCAAAGTGGGGGCAACAGCTTGCTTCTCTTTCTCTTCTCAC 
CCTCAGTTTGTATGGGAAAATGGAGATGTCCTCTCCACTTTATCCCACGATATCTA 
AATG 

Sequence ID 575 SEQ ID NO: 182 nt: 209

CAGGATATCGAGACCATCCCAGACAGCATGGTGAAACTCCGTCTCTACTGGAATAC 
AAAAAGTTAGCCGTGTGTGGTGGCACGCGCCTCTAATCCCAGCTATTCGGGAGGCT 
TAGGCAGGAGAATTACTTGAACCCGGGAGGCGAAGGTTGCAGTGAGCTGAGATCGC 
ACCATTGCACTCCACCCTGG-CGACAGAGCAAGACTCCGTCT 

Sequence ID [illegible] SEQ ID NO: 183 nt: 541

CAGCCAACCCAGAAGGAGCCAGTCTACAACTATGCCTGATCCTCCTCATGGCAGGC 
CACGAAGCATTGCTGCCATGTGTTGAATTATAAAACCCACATTGCTTTTTGAACCC 
TGTTGCGGGTAAAAATAACCAAATTATCAGTCCTTGGAAACCCAGGCAATCAAGTG
AGTACAAGGTAAAGATAAGTATGGTTTAGAGGAGAAATTATGTTCCTGAACTGGTG 
TCCTTTGATGGCAGCGTCAGCCTTGCTAAGTCAGAGTAGAGGGAGCAGTGACCTTA
ATAAGCTTTGGTGAGCATCATGTGCACGCGTGGGTGGGAGTCCCTTTCACTGATGC 
TTTTAAAAGTGCTTTTGCAGACCCTGGAAGGGATCCTCCACACATATGAGGTGTGG
GACAGGTAGGCCAGAGAGGATTAGCCCTGCTTTCGAGACTAGAAATCTACAGTCCT 
GAAGGAGCAGTAATTAATTGGTACACCTGTCAGGGCCAGCCCCCAGGTCTCCTGGC 
TTTTTCCAGGTTTTCTGTCTCACATGATTTTGCTTTT 

Sequence ID 577 SEQ ID NO: 184 

CTTTAATTTTTCAAGTGTTTAAAAAACAATTTTATACTTAAGCCAGCCTTGAAGAT
AAGCACAAAATTTACCAGTTTACATTTAAAAAACAAACAAAAAACGACAACAACTC
AAGCACCCGCTCTGTGCATAGCACTATTCTAGGTGCAATAAAAGGGAATCTTAACC 
TTAGAAATATGAGTTCACTTTCTGGAATTGTATTATCTCCTTTTCCAGAGAGTAAA 
AATAAATAAAATCACCATTGTTTACTACAGATCTGCCCCAAACCACATCTGGTTCA 
CAGAAAGGCTAATTTCTGCCAAATTAAAGATGTAATGAACTCAGTTCCTGCTTTCC 
CAAAAACACGAAAGCAGAATTCCTTTTCACTGAAAAAAATAAACAGTTTTCCATGC
AAGGGCAGTTTGCTTCTAATAAGTATTTTTTAAAAAATTTTTTTTTCCTCTAGCTT 
TTCTTTAAATTTTCTTCCTCTAATATTGCCTTTTCTTGTACAAGGCAGACCAGGTA 
TCTTTTTATGCTGTTTTTCCTTTACTAAGAAAAGTATTGCATCTTGAAGACAAACC
ATTTCCCAGAGTAGTGATAAAAAATAACACTAAAAAAACTTTAAAGGTGAGTCACT
TCATCACCTTGATGAAGTAAAAAA

Sequence ID 578 SEQ ID NO: 185 

GGAAAAAATATTTCCACTTAGATATTTTACATGGTTTTGTTTAAAATTACCATTAC 
TTGTTTTTTAAAAACACATGACCACATATGTATATGTATATCTACCTAAACATTGT 
ATCATGGTTTCAGTATGTTATTCATGTATTACTGGGAGATGCTACCAAGAAACCAA 
CCCAAAGAAAATTCTGGAAAATACATTTCTATTTATAGAATAAATGTTTCATTTAT
ATAAAAGCAAAAGAACTTAGAGTTCTAATAAATGGGATGTCTAATAAATTATGAAG
TTACTGATTTGAATATATTATATTTTTATAACTTCCTTGCCAAAGTCCTGATTTAG
TACATTAGAGAACCTGTGTTTCCTCTCTCCTCTACCATTCATCTCTCTTCCATACA 
GTCATTTGGGCTTTTTACTCAAAGAGAATCAAGAAATAATAAGGTATAACAAGCTT 
GGCAAAGTGTTGGCTTTTTAAAAAAAAATTTTTTTAATCTCTAGCAGTTTGGTAAT 
TTAGCAGCATCATTTATTTGGGATTCTTTTATCTGATTTCAACAGTGAAAAACATC 
CCTATGATAAAGCCTAATGACCCATTTCCAAAAGATGGAATTGCCCTTCCTAGAAA 
ATATGACGGAGAAAAGT 
Sequence ID [illegible] SEQ ID NO: 186 nt: 502

CGAATAGCCAAGTGGTCTGACAAGATCGAGAGTAATGAGGCCCATACTTTAGTACA 
GTCTTGAATGGCCAGATGGTGCTGGGCATACCCCAACCAGAGATATGTAAGTCTTT 
ATGTTGTCAAAATTTCCCAGAAACATGAATTTCCCACTAAGATTCATTAAGGAAAA 
CTAGAATGAAAACAAAAACGTTCCTTGTATAATATTCATTANAAAGAAATGAAGAA
GGCCGGGCATGGTGGCTCACGCCTGTAATCCCAGCACTTTGAGAGGCCAAGGTAGG 
CAGATCATGAGGTCAGGAGTTTGAGACCAGCCTGGCCAACATAGTGAAATCCCGTC 
TCTACCAAAAATACAAAAAAATTAGCCGGGCATGGTGGCACACACCTGTCATCCCA 
GCTACTCAGGAGGCTGAGGCAGGAGAATTGCTTGAACCTGGGAGGTGGAGGTTGCA 
GTGAGCTGAGATTGCACCACTGTACTACAGCCTAGGTGACAGTGCAAGACTCTG 

Sequence ID [illegible] SEQ ID NO: 187 nt: 316

CCTATGCCAAACTAAAGAAAGCTTGCCTGGCCTACAGGCCTAAAGGTTCAAATGNG 
GATTAAAAAAACACAGTAGTCACATAAAATGTCTGCTGGCTGGCTGGAATTCCATC 
ACCTACAATTTACCTGCTTTCAAAAACTGTGTTCAACATTGAGAAAACAGAAAACC 
ACTTATCTTGAGCTTAATATGGGCTTCTTTTTCCTTAACTGTAGAACACTTACTGA 
AATATCAAATCAATGGTTAGGATATGTATCCTAGGCAGGCCTAAACCATTAACACT 
TGGTTTAAGCAACTTTGTATAATTNACCTCCTAAAT

Sequence ID 581 SEQ ID NO: 188 

CTTCATGAGTGCCCGGTTGCCCAAGTCAAAAACCTGGGAGTGATATAAACTCCCCA 
CACATCCAGTCAGTCACTCATCAACTCTATTGATTCTG-CTGCTAAATATATCTCA 
ATTGTATTAACTTAAACATATGCATAATACATCTTCTTCTTCACTGCATTTTTGTG 
GGCTGCACTTACCTTTCAGGTAACAACAACACTGGCCCCTCTTGCCCTTCTAGTCA 
GAAGTGCCAAAATGATGAGAGCTAGCCATGACAAACCCACAGCCAACATTACACTG
AATGTGCAAAACTGGAAGGGCATCCAAACAGAGGAGG

Sequence ID 582 SEQ ID NO: 189 

TAGAATTCTCGCCTGCCTTGGCTTCTCCCTCTAGTTGTTCCTTCTCTGTCTTCTGT 
GGGCTTCTTATTGTCTGCTCACTCCTTCTTCAGTGTCCTCTCATGGGCTTCCTTCC 
CTTCTCAGCTGATGCCATCACCTGGGGAATCACAGTTACTCAGCAGCACTGGGGCC 
TCTCTATCTCTATGCTGGTCATGCCTATGTGTGAGCTGCAGACCCAGTGGAATTTC 
CATTTGTGCATCCCATGCCCAGCCCACCCTCCACCAGCCTCGAATGCAGCTGTTCA 
GCCCTACCCCAGTCCTCAGAAAAGTTCCTCTCCCTGGATCCTCTTTTTCCTTCATG
AGTGCCCGGTTGCCCAAGTCAAAAACCTGGGAGTGATATAAACTCCCCACACATCC 
AGTCAGTCACTCATCAACTCTATTGATTCTGTCTGCTAAATATATCTCAATTGTAT 
TAACTTAAACATATGCATAATACATCTTCTTCTTCACTGCATTTTTGTGGGCTGCA 
CTTACCTTTCAGGTAACAACAACACTGGCCCCTCTTGCCCTTCTAGTCAGAAGTGC 
CAAAATGATGAGAGCTAGCCATGACAAACCCACAGCCAACATTACACTGAATGTGC 
AAAACTGGAAGGGCATCCAAACAGAGGA

Sequence ID [illegible] SEQ ID NO: 190 nt: 631

CTGAGGTGGGAGGATTCCACTCTCACCCATTTCTTCTTTCATTTTCAGTTTCTCCA
GTTAGTAACTGAAGATGTTCTTTGAGTAATTAAGTGAGTGAGAAAATTTTTAAGTG
AGAAATCTATAAAAAGAACCATGTTAACATAAATATTTCAGTCCTTACAAGTTGGT 
ATTGACTTTTCTCATTGGTAATCTGACTGATTTAATACTGCTCATTCCAATATCTG 
GTGATGTAATTCTGGTTATGAATCCTTGTATTAATAACACCTCCTGGGAGGTTTTT 
TTTCCCCAACATTACATTCAGAATATTAGAGCTGAAAATACCTTTTTTAAGGTTAT 
CAGGAGGAGGGAGCTTATGTTTAATGTGGTGGATAAAACTTAACTGCTGGTTAATA 
CAATTGTTATTCAGGTGAAATTCCCTAAACTTTTCACGTGCAAAGTTTTGTATGTA 
TACAGACATTTGGGGAAAAGTTTTATCATCCCTAAAACCGGTTACTGTCCAGAAAA
TGATAAGAATCCCTGGGTTCCAAATCCTTCATAAGGTATTTATTCATTTATTTATT 
CAACACATTTACTCAATGCCTCCGCTCTGCTGCAACTACACTGACATTCTGCTTCT 
AATCTAACCGAAAAT

Sequence ID 585 SEQ ID NO: 191 

TTTCAAATTGTACAATAACACAAACAACTTTGTTAAGGCCATGTTTTATTTGCTGA 
TTAATGGACAAAAGGCAATGTAATTTATTTTCAAGTATTTTCTTGAAAGTCTGTGC 
TCATAAAAATCATGAAAAGTTGGAAAGACTGTTAAATCACTGAAACTTCAAATATA
TCTTACACAATCTTGTTTGTACAAAAATACAAGTTAAATATAAACATAAAGCAATC
ATGGTAATTTTATGCAAATCTGTTTTATGTGATCATCAGTTATATATAAAAGTTTC 
TCAGTTCTGTTATTTGTGAAAAGATCAATACCAGATTGAATGACTACCTATTGGCA 
AAGGGCCCTAAAAAGCTTACTTTAGCACTCATCTTTTACATGGTTAAATGCATTTC 
CTAATTTGAGATCACCTAAACACTGGAAAAGAAAAAAAATGAAAGGGCAGTATGTC 
CATAAACCAACAAATAATTTGGCTGTAATGTATCATAAAACACAAACCCCACACAT 
CTGTACAATAAACATTATGTATTACATACACACAACACACACCCAGTCATAAAGCC 
TAATGATGTGCTGCTTCCAGTTCAATATTCAGCTGTGCATTTTTTCTTATTTCATC 
AAATGAATAGCTTTTTGTCACC 
Sequence ID 586 SEQ ID NO: 192

GTAAACTGTTCTCTCCGAGGGAAAAAATGGAAGTTATCCTCACAGTTCACTGCCGT 
GGTATTTCTTCTGTCCCATGCTTTGCATGACTGCCATGGTACAGCCTTGTTTCAAA 
CTGTTCACTGTGATCTGTGGGTCTTTGAGTTTCAGTGAGTTTGCTGAAATGTCGAA 
GAAGTAGTTCCAAACTTCAATGTTCAATGAAATTTTTGTTCAAGTTTGAAATGGAG 
AGAGCAGCTTTAAAAGGTACTAAGCCTTTTACAAATTGGTGAGTACTGGCACATGA 
GAT 

Sequence ID 587 SEQ ID NO: 193 

TTTTTTTTTTTCCTTAAAAGGTAACCCCTAAACACAGCTAAAACTATGCCATCAGC
TGACTCCAAGGNACACACAGTCCTGTATCTGGAACTACTGAGTGGCAGGCATCTTT
CTCTGCCTCTGACAGTGGAGTCCCCATCACTGCAGAGCATAGCCAAAGGAGTCAAA 
GGTCTCAGCGGGTCACTGCCTTATCAACCCTCACCAGTCCCTTATGTTTTTTAATA 
TTTTATAATCTTGACATGACACCAAGATGCTTTAATAAAAAAGCACCTCTAACTCG 
GTCTTGTATTCACTTACCTTGAGCCTGGGACTTCTCTAGGCTCCTGAGGCAAAAAC 
AGGTAGAGGGGAGATGGTGGAACATAAAACACAATTTTGCTTGGCACCCACCTTGG 
CGTCTGTCCCCATGACCAGGTCTTTCAATTCGATGATTTTGTCATTGATGGAGGAG 
CGATATCGTTTCTCAATGATATTATGGGTTGTCCGCCTTTCTCCTTCTTTGGGGGG 
CTCAAGCTGCTTGACTCCCCCAGGTACCTGCTTAATGGGGCACTTTCTCTTGCCCC 
ATCATTACAGGCATTGTGGTCAGAATGGTCCCACTGCTGCCCACCAGGGTCTA 

Sequence ID 588 SEQ ID NO: 194 

CTAGTCTTTTCATAGTCTGCATAGAGTCTGGCCATTACCATCAGTTTTTAAGATGT 
CCATATTGTGGCCGGGCGCGGTGGCTCACGCCTGGTAGTCCCAGCACTTTGGGAGG 
CTGAGGCAGGTGGATCATGAGGTCAGGAGATCGAGACCATCCTGGCTAACACGGTG 
AAACCCGTCTCTACTAAAAAAAATATTAAAAAATTGGCCAGGCCTGGTGGTGGGCG 
CCTGTGGTCCCGGCTGCTTGGGAGGCTGAGGCAGGANAATGGTGTGAACCCGGAAG 
TCGGAGGTTGCAGTGAGCCAAGATTGCACCTGGGCAACACAGCGAGACTCCGTCTC 
AAAAAAAAAAAAAA 

Sequence ID 589 SEQ ID NO: 195 

CAATTATTTATTACCTTTCCATTTGTTCGCCTGATGATGTGACAATGCATGGTCTT 
TGTGCATGCTGCTAGACACTTTTCTTTCCCAGCCGAAAAGTCTATTATGTAATTTT 
TACATTCATAATTTTAATGTGGATGATCAGGATTAAATCAAGATATATATCTGGAA 
CCTCTTATAAATGGAGCACTTAGAAATTTGTTGTTCTGCACTTAACCTAGAGAGAG
AAAAAATGCTTTTCTTTGTGAAAAATCTGAATTCCTGTCCTGACCTTCTGTGATGT
GGAAACCCTAGGCTCTGAGACACACTCTCTGGTGTCTGAGACAGAACCAAAGCAAT 
AACGTTGTGATGCCCACAGGCCTGGAGCCAGCTAGCGACCTTGTGCCGCCCAGCTG 
TCCATGGCCCGTGCAGAGCAGAGGACAGTGAGTGTCTGCACTGAGAACCTTAAACC
ACAGTTGAACATACCCACACCTGTTTGTCTTAAGCTATAGTGTAAAAACAAAGTTT
GGGCTCTGAAAATTTAACTGAAAAAGATTTCCTTGTT 

Sequence ID 590 SEQ ID NO: 196 

GTGGCAGCAGGCGCAGCCCAGCCTCGAAATGCAGAACGACGCCGGCGAGTTCGTGG 
ACCTGTACGTGCCGCGGAAATGCTCCGCTAGCAATCGCATCATCGGTGCCAAGGAC 
CACGCATCCATCCAGATGAACGTGGCCGAGGTTGACAAGGTCACAGGCAGGTTTAA 
TGGCCAGTTTAAAACTTATGCTATCTGCGGGGCCATTCGTAGGATGGGTGAGTCAG 
ATGATTCCATTCTCCGATTGGCCAAGGCCGATGGCATCGTCTCAAAGAACTTTTGA 
CTGGAGAGAATCACAGATGTGGAATATTTGTCATAAATAAATAATGAAAACCTAAA

Sequence ID 591 SEQ ID NO: 197 

CAGCAGCAGAAATGTTTGCAAGATAGGCCAAAATGAGTACAAAAGGTCTGTCTTCC 
ATCAGACCCAGTGATGCTGCGACTCACACGCTTCAATTCAAGACCTGACCGCTAGT 
AGGGAGGTTTATTCANATCGCTGGCAGCCTCGGCTGAGCAGATGCACAGAGGGGAT 
CACTGTGCAGTGGGACCACCCTCACTGGCCTTCTGCAGCAGGGTTCTGGGATGTTT 
TCAGTGGTCAAAATACTCTGTTTAGAGCAAGGGCTCAGAAAACAGAAATACTGTCA
TGGAGGTGCTGAACACAGGGAAGGTCTGGTACATATTGGAAATTATGAGCAGAACA 
AATACTCAACTAAATGCACAAAGTATAAAGTGTAGCCATGT 

Sequence ID 592 SEQ ID NO: 198 

TACTCAATGAAAAACCATGATAATTCTTTGTATATAAAATAAACATTTGAAAAAAA
AAAAAAA 

Sequence ID [illegible] SEQ ID NO: 199 nt: 565

CAGGATCAAGGTGAAAAGGAGAACCCCATGCGGGAACTTCGCATCCGCAAACTCTG 
TCTCAACATCTGTGTTGGGGAGAGTGGAGACAGACTGACGCGAGCAGCCAAGGTGT 
TGGAGCAGCTCACAGGGCAGACCCCTGTGTTTTCCAAAGCTAGATACACTGTCAGA 
TCCTTTGGCATCCGGAGAAATGAAAAGATTGCTGTCCACTGCACAGTTCGAGGGGC 
CAAGGCAGAAGAAATCTTGGAGAAGGGTCTAAAGGTGCGGGAGTATGAGTTAAGAA
AAAACAACTTCTCAGATACTGGAAACTTTGGTTTTGGGATCCAGGAACACATCGAT 
CTGGGTATCAAATATGACCCAAGCATTGGTATCTACGGCCTGGACTTCTATGTGGT
GCTGGGTAGGCCAGGTTTCAGCATCGCAGACAAGAAGCGCAGGACAGGCTGCATTG 
GGGCCAAACACAGAATCAGCAAAGAGGAGGCCATGCGCTGGTTCCAGCAGAAGTAT 
GATGGGATCATCCTTCCTGGCAAATAAATTCCCGTTTCTATCCAAAAGAGCAATAA
AAAGT 

Sequence ID 594 SEQ ID NO: 200 

CAGAAGAGTAAGCAAATCTCAAAGCAGCGAAAGGGAAGAAACTAAAAAAGGTAGAG 
CAGAAATAAGAGAAAATAGAGAAGAGAACAATTGAGAAAAATAATTGAAACCAAAA 
GGTGGTTCTTTGAAAAGCCTAACAAAATGGACACATCTTTAGTTAGAGTGACCAAG 
AAAAAAGGGCAGTGACTCAGATTACTTCATTCAAGAGTGAAAGAGGGCACATCACT
ACCAATTTACAGAAATAAAAAGGATTATGAGGAAATACTACAGATAATTGATGACA 
TTAACTTAGAAGAATATATTTCAAGAAAGACACAAACTACTGAAACCGACTCAAGA
AGAAACAGAAAATCTGAACAGACCTATAAAAAATAGAGATTTAATTGATATTCAGA
AAGTTTCCCAAAAAGAAAAGCACTGGCCAAGATGACTTCACTGGTGAATTCTATCA 
AGTGTCAAAGATGAATTACTGACATTCATTCACACTCCTTTAAGAAATAGAAGAGG
GGACATCACTTTTCAAAGCATCGACATTCTAATCATTAGTCCCTTGGTTTCCTGCT 
CCCAAAGCCAGGTGATGTATCACAAAAAAACCCCTACAGACCCACTGGGCACAATG 
GCTTTATGCCTAT 

Sequence ID [illegible] SEQ ID NO: 201 nt: 98

CTTTGCTCGAATNGTCAGATAAGGATTCTGTGAANGGAGATGAGATTTCCATCCAT 
GCTGACTTTGANAATACATGTTCCCGAATTGGGGNCCCCAAA 

Sequence ID 596 SEQ ID NO: 202 

CTCAAGTGTTCCCTCAGCTTAGGCTTTGTTTAAATGATCCCACCCAGGGGCGATGG
TAGGGAACAACAGGGTCACTAAACTATTTGGCTGGCTACAACTCTGGGAAATGGTA 
AGACAGGGAAAGGCCATGTTGTTCATTCCCTTGTGCAGATCTAGGGAGAACCGCAG 
AGAGAACAGTTAGCATTTCTTGTTCAATGAATTATCCTATTAAGAACACTGGATGT 

Sequence ID 597 SEQ ID NO: 203 

CGGNCGCGGTCGACGCTACTCCTACCTATCTCCCCTTTTATACTAATAATCTTATA 
AAAAAAAAAAAAANAAAAAAAAAAA 



Sequence ID [illegible] SEQ ID NO: 204 nt: 362

GGCATGTGCCTGTAGTCCTAGTTGCTGAGGTAAGAGGATTGCTTGAGCCCAAGAGT 
TCAAGGCTGCAACAAGCTTTGATTGCGCCACTGCACTCCANCCTTGGCGACAGACT 
AAAACGCTGTCTCAAAAAAAAAACAAAAACGACNAAAAAAAAACAAAACAGAAAAA
ATTAACTTAGGCAATGACAGTCCCTGGCAAATGCTGGGAGGGAGGCAACANTGGTC
AAGGAAGGTAACCCTGAANCAGGACTTGTAAAGCAAATAANATTGGGAGGCCAAGG 
TGGGTGGATCACNAGGTCAGGAGTTCGAGACCAACCTGGCCAACATAGTGAAACCC 
CGTCTTTCTAAAAATACAAAAAAATT

Sequence ID 599 SEQ ID NO: 205 

GACAAAAGAACCATTTGGATACATAGGTATGGTCTGAGCTATGATATCAATTGGCT
TCCTAGGGTTTATCGTGTGAGCACACCATATATTTACAGTAGGAATAGACGTAGAC 
ACACGAGCATATTTCACCTCCGCTACCATAATCATCGCTATCCCCACCGGCGTCAA 
AGTATTTAGCTGACTCGCCACACTCCACGGAAGCAATATGAAATGATCTGCTGCAG 
TGCTCTGAGCCCTAGGATTCATCTTTCTTTTCACCGTAGGTGGCCTGACTGGCATT 
GTATTAGCAAACTCATCACTAGACATCGTACTACACGACACGTACTACGTTGTAGC 
TCACTTCCACTATGTCCTATCAATAGGAGCTGTATTTGCCATCATAGGAGGCTTCA 
TTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCTACGCCAAAATC 
CATTTCACTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTTTCT 
CGGCCTGTCCGGAATGCCCCGACGTTACTCGGACTACCCCGATGCATACACCACAT 
GAAACATCCTATCATCTGGAG

Sequence ID [illegible] SEQ ID NO: 206 nt: 595

TTCAAATTCTTGNTAANAGTCTTTGTTCTGAATTTTACTTTGTCTGTTATTCCTAT 
AGCCTTTCCAATTTTCTTTCGCTTGGATTTTACGTGATAAGTTTTTTCCCCCATTT 
TACTTTTANCAACTCTATATTTTTTAGTTGAGGTTGGGTTTCTTGTAAACAGCATA 
TAATTTGGGTTTTTTAATCCAATCTGAAAATTAATGTCCTTAATTTTGTGTTTATA 
CCATTTACACATAATGTACTCATATATAAGGTTTAACTGAAACCTACTATCTTGCT 
AGTTGTGCTCTACTTGAATTTTTTTTTAGTATTCTGTTTTAATTGACCAACATTTG 
ACTGTATCTCTTTGTGTAATTCTTTTACAGGTTGCTGTAGGCATGACAATATATAC 
ACTTAACTTTTCTCAGTACACTGAGAGTTGAAATTGTAGTACTTCGAGGAAAACAT 
AGAAAACTTGCAATGATATCGGTTACATTTTACCACCTCCATATGTTGCAATTATT 
AAATGTATTAGATCTGCCTACCTCGAAAACCCATCAGTCTTTTAACTTTGCTCTCA 
ATGGTGATTCATATTTTTAAAAAAACTTGAGGCAA 
Sequence ID [illegible] SEQ ID NO: 207 nt: 522

TCGACCGGGTTTGGAGCAGTGCCTTGTTTGCTGTGCAGCGGATACTCTACAGGTAC 
ATTTCCTTTTTGGAACCAAAAGGGAGGGATTTGACAATATTGATGGTAGATCTTTT
TTCTTTAGCAAGAATTAAGGATTTTGGTGGGTGGGGGGAGGCTTCTGTGGGGACCA 
AGACAATGTACTGTCAGTCAGGATTTAAGTCGAACTACCTCATCCCTTGCCCCAGA 
GAACAGTTGATCGTGTTTTAAACCAAAAGGTGCGGAATGGAGAGAGGGAGGCGGTG 
CATTGCAGCTTCCGATAGAGCTTTTTATTTTTGGATATCAGGAACCAATTTTGAAG 
ATTTCTTAAGAAAGTCATTTACATCAGGGACATGAAGAGCAAAGTAGGTATTTTTG
GTCAGTACTTGAATTTGATAGGCTTTATGCAAACAACTCTCCCTCTGCTGGAGTCT 
GGCAAGTTTGCTTTTCACTGGACGCTAATTCAAGTGCCATACAAAACTAAAATAAN
AGTTTTACTTATAACACA 

Sequence ID 602 SEQ ID NO: 208 

CAGAAATCGCAATTGAAGACCAGATTTGTCAAGGTTTGAAACTGACATTTGATACT
ACCTTCTCACCAAACACAGGAAAGAAAAGTGGTAAAATCAAGTCTTCTTACAAGAG 
GGAGTGTATAAACCTTGGTTGTGATGTTGACTTTGATTTTGCTGGACCTGCAATCC 
ATGGTTCAGCTGTCTTTGGTTATGAGGGCTGGCTTGCTGGCTACCAGATGACCTTT 
GACAGTGCCAAATCAAAGCTGACAAGGAATAACTTTGCAGTGGGCTACAGGACTGG
GGACTTCCAGCTACACACTAATGTCAATGATGGGACAGAATTTGGAGGATCAATTT
ATCAGAAAGTTTGTGAAGATCTTGACACTTCAGTAAACCTTGCTTGGACATCAGGT
ACCAACTGCACTCGTTTTGGCATTGCAGCTAAATATCAGTTGGATCCCACTGCTTC 
CATTTCTGCAAAAGTCAACAACTCTAGCTTAATTGGAGTAGGCTATACTCAGACTC 
TGAGGCCTGGTGTGAAGCTTACACTCTCTGCTCTGGTAGATGGGAAGAGCATTAAT 
GCTGGAGGCCACAAGGTTGGGCTCG 

Sequence ID [illegible] SEQ ID NO: 209 nt: 624

GACACACGAGCATATTTCACCTCCGCTACCATAATCATCGCTATCCCCACCGGCGT 
CAAAGTATTTAGCTGACTCGCCACACTCCACGGAAGCAATATGAAATGATCTGCTG 
CAGTGCTCTGAGCCCTAGGATTCATCTTTCTTTTCACCGTAGGTGGCCTGACTGGC
ATTGTATTAGCAAACTCATCACTAGACATCGTACTACACGACACGTACTACGTTGT 
AGCCCACTTCCACTATGTCCTATCAATAGGAGCTGTATTTGCCATCATAGGAGGCT 
TCATTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCTACGCCAAA 
ATCCATTTCACTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTT 
TCTCGGCCTATCCGGAATGCCCCGACGTTACTCGGACTACCCCGATGCATACACCA 
CATGAAACATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAGTAATATTA
ATAATTTTCATGATTTGAGAAGCCTTCGCTTCGAAGCGAAAAGTCCTAATAGTAGA 
AGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCCCACCCTACCACACAT 
TCGAAGAA 

Sequence ID [illegible] SEQ ID NO: 210 nt: 338

ACCTGAGGCCTCGGTGGGGCCAGTGCGACGCTGGCTTAAGGAGCTGGAGGGGTTCC 
TAATACACATTTAATTCAGTTTCTCTTCCCTAAGAGGCTGCCGGAGTTGGGGCCTC 
CTCCAGCAGAGACCCTCGGACCCCTGCAGGGCCTGGACTTGGGGTGAACAGGGCTT 
CAGTCAGCGCAAGTATTCCATTTGCATTTGGTAATTTTTCATGCCACCTATTTATG 
AATATATAAATCTTTATACCAAATCTATTTTTTAAAACATGGAAAAGTTGCCTTTA
TGGAAACTTGGCAGAGCCAGAGTGTACACATTCCTAAACCATTAAACAGATTTCTA 
TA 

Sequence ID [illegible] SEQ ID NO: 211 nt: 556

GGATAATGATACCTCTGACCTTTCTTCCTTTTGGGAAGTACTTGAGTGTGCAGCTG 
CATGAGGCCTCAGCAGGAGAGAGATTTTAGGTCCAAGAAGCTATACCAGTAGGACA 
AGGCAGGAAAATACTACACTTTCAGGATCAAGCCCCTCTGACTCTCATTTGGAAAC
TGGATGTTTGCTAAGCACCTGCTTCTTAAGGATGCCGAGGGATTTAATGATACTCC 
CAGAAACCTGGAGAGATTAATGGGGCCTATGGAGAAGTGCTCTGAACTCAGTGTTG 
GGACTTGAATAAAATTAACCATTGTCATGTTTTCAGAACAACTAAGCTGTTTTATA 
TTTCATGTGCATGAAAGCCCTAGAACTAAGTTGTGTTATTTCCAGAAATGAAATAG 
ATCCCACAGTTAGATGATGTGGCCATTAGGAAGTACCAAATTTATAAAAATCACTG 
GAGGTCTGTCTGAGCAGTACCTAATAAAATATAGTATACTGAAAGTGAACAGATCT 
TTGTCTCTTTCTTTGGCTGCTTGATACTTTATCTGTGTCTGCCGGACAGTGC 

Sequence ID 607 SEQ ID NO: 212 

CAATAAAAGCAGGTTAACCTCAATGATAGCAGTTAAAATGTTCTATCTTATGTATT 
TCTTTTAAGTATTACCATTATGGTGCTACTGAGCGTTTTCTTTTGGTAAAAAGAAA 
AATGCCATGGGCTGCAGTCTTCTTCCATCACTTTTCCCTACCAGGTCCATTAATAT 
GCTTATAACACTAGTGCCAGTTATTTTATTTGATAATGCTTATGGTATTTGTATAT 
TTGTTTGCATTCCAATTTTGTTTAATAATGAGTGTGTAAACTGCATACGTTAAATA 
AATGTAAATACTAATGTACTGCTGC 
Sequence ID 609 SEQ ID NO: 213

TTTTATTACCCAAGTTTTAACCTCTGTCTGGTGATTTGTTGTTGTTGTTGTTGTNG 
TTGTTGTTGAAGTTCAGGCTGCATGTGGGATAGGTTTGCTCAGGCATACTTCTTAG 
GAAGTAGTCACTTGCATGACTGTTTTTGGGATAACTCTTTGAGTATTTGGAGAGGT
CTATTGTAACTTCTGAAAGGCATTGTTTTTACGTATGAATGTTCTAAAATTCATTC 
TAAATGGTCATGAAAAGAAAAGGATTCACATTTTAGAATGGCAATAGTCCCTGAGG 
ACTATTATGTCTTTTAGATTTCCTGTGGGTTTCTAGGAATGTTAGTGTAACTTANA 
TTTCCACCTACCTGATTTCTGGATGTGCCTATTGGAACTTGCTGAGATCTTTTTTT 
TTCCTTAACATGTTGTCCCCTTGACCCGTACTTCGAAACTAAACATATTATTTTAT 
TTGCTTACACTTCAGGAGGCAATTGGCAGACACCAGGCCAACAGTCT 

Sequence ID 610 SEQ ID NO: 214

GCTCTGACCCCAGTTGGAAATGTATCTGTACTTTGTCCGGCTTCCACTCAAGGACC 
ATTTATGACATTGCTTGGTGTCAGCTGACAGGGGCTCTGGCCACAGCTTGTGGGGA 
TGACGCGATCCGCGTGTTTCAGGAGGATCCCAACTCGGATCCACAGCAGCCCACCT 
TCTCCCTGACAGCCCACTTGCATCAGGCCCATTCCCAGGATGTCAACTGTGTGGCC 
TGGAACCCCAAGGAGCCAGGGCTACTGGCCTCCTGCAGTGATGATGGGGAGGTGGC 
CTTCTGGAAGTATCAGCGGCCTGAAGGCCTCTGAGCTACCTCGACTTTGGACAGAG 
TAATGACTCCCCAGAAAACGTCATATAAGACTTTACCAGCCCCTGAGAGGACCAGG
AGGAGCATCCTTGACCTTCATTTAACTTGGCTCACTTCTCTTCANACTTGGGTAGA 
AGTGCAGAGCCACAAAATTGCTTTCCTTCCCCGCCTTTGACATGAGGCCTTCAGTA 
AAG 

Sequence ID 611 SEQ ID NO: 215 
TGCAGGATCCGTCGACT 



Sequence ID 612 SEQ ID NO: 216 nt: 576

GAGAAATATAAGATTATGTATAGATCAAATCTACCTCTATTTGGTGTCCTGAAAGA 
GATGAGGAGAATGGGACAAACTTGGAAAGCTTATTTCAAGATAACATTCCTGAGAA 
CTTCCCCAATCTTGCTAGAGAGGCCAACATTAAAATTCAGTAAATGCTGAAAACTC 
CAGTAAGATATTTCTTAAGAAAATTATTCCCAAGATATATACTCATCAAATTATCT 
AAGGTCAAATGAAGGAAAAAATTTTATAGGCAGCTAGAGAGAAATGTCAGGTCACC 
TACAAAGAGAATGGCATAAGACAAAAAGTAGAACTCCCAGCAGAAACTCTAAAAGC
CAGAAGAGATTAGGGGCCAATATTTAACATTCTGAAAGAAATTCCAACAAGGAATT
TCATATCCAGCCAAACTAAGCTTCATAATTGAAGGAGAAATAAGATATTTTCCAGA
CAAGCAAATGCTGATGAAATCCATCACCACCAGACCTGCCTTATAAGAGCTCCTGA
GGGAAGCACTAAATATTGAAAGGGAAGAACTTTATGAACCATTTCAAAAACACATT
TAAGTNCACAAAGCAG 

Sequence ID 613 SEQ ID NO: 217 nt: 341

CCTTATTTTACAGGTGAAAAACCACGAATCAGATAGATTTTTATTTGCCCAAGTCA 
CATAATATTAAGAACAGGCCAAGTGTGGTGGCTCATGTCTGTAATCTGAGCACTTT 
GGGAGGCTAAGGCGGGTGGATTTCCTGAGCCTAGGAGTTTGAGATCAGCCTGGGCA 
ACATGGCGAAACCTCATCTCTACAAAACATACAAAAATTAGTCAGTGTGGTGGTGA 
GAGCCTGTAGTCCTGGCTACTCGTGAGGCTGAGGTGGGAGCATCACCTGAGCCTGG 
GAAGTCGAGGCTGCAGTGGCAACAGAATGGGTAACCTGGACATCAGAGTGAGACCC
TGTCT 

Sequence ID 614 SEQ ID NO: 218 

CTCACACCTGTAATTCCATTACTTTGGAAGGCTGAGAGAGGAGGATCAGTGGAGCC 
CAGGAGTTTGAGACCAGCCTGGGCAATATAGGGAGACCCTGTCTCTACAAAAATGA 
AATAGCCAGGCGAGGTGGCATGTGCCTGTGGTCCCAGCTACTTGGGAGACTGAGGT 
GGAAGGCTGCCTTGAGCCCAGGAGTTCCAGGCTGCAGTGAGCCATCATTATGCCAC 
TGCACTCCAACCTGGGAGACAGAGTGAGAGAGACCCTGTCTCAAACAAACAAACCC
AAAATAGGCCAGGCACAGTGACTCATGCCTGTAATCCCAGCACTTTGGGAGGCTGA 
AATAGGCGGATCATTTGAGGTCAGGAGTTCAAATTCAAGACCAGCCCGGCCAACAT 
GGCAAAACCACATCTCTACTACAAATAAAAAATTAGTTGGGTGTGGNGGAGCATTC 
CTGTAATCACAGCTATTCAGGAGGCTGAGGCATGANAACCGCTTCA 

Sequence ID 615 SEQ ID NO: 219 nt: 379

TAAATTTAAAACATTTTAATTAGCTGGCATGATGGCATGCACCTGTAGTCCTACCT 
ACTTGGGAGGCCAAGGCAGGAAGATTGCTTGAGCCCAGGAGTTTGAGCTTACTGTG 
AGCTGTGATCACACCACTGCACTCCAGCCTGGGTGACAAAGGAAGACCGTATTTCT 
AAAAAATAAAAAATACAAATACAACTACAAACTAGCACTAGACCAACAGTGACTAT 
GTACCATGAACTGAGGAATATTATTAATTCCACCATTTGCATCTGAGGTTAACAAT 
ATGTCAATGACTTAAATAACATCATATCTCTGAGAGTAATTTCTCCTATATTTCCA 
TGACAAATGTTAGATAATTTTCCATTTTTTCCATTCAACAAAA

Sequence ID 617 SEQ ID NO: 220

TTTTCAGGCATGTCAGAGAAGGGAGGACTCACTAGAATTAGCAAACAAAACCACCC 
TGACATCCTCCTTCAGGAACACGGGGAGCAGAGGCCAAAGCACTAAGGGGAGGGCG 
CATACCCGAGACGATTGTATGAAGAAAATATGGAGGAACTGTTACATGTTCGGTAC
TAAGTCATTTTCAGGGGATTGAAAGACTATTGCTGGATTTCATGATGCTGACTGGC
GTTAGCTGATTAACCCATGTAAATAGGCACTTAAATAGAAGCAGGAAAGGGAGACA 
AAGACTGGCTTCTGGACTTCCTCCCTGATCCCCACTCTTACTCATCACCTGCAGTG 
GCCAGAATTAGGGACTCAGAATCAAACCAGTGTAAGGCAGTGCTGGCTGCCATTGC 
CTGGTCACATTGAAATTGGTGGCTTCATT 

Sequence ID 618 SEQ ID NO: 221 nt: 598

GATTAACTTTCATTTTAAGCTCTTCTCTACTAATTCTGTTCGTATGTTTATTCATT 
TTGCGTTGATCATATTTTGTACACCAGGCACTCTTCTCAGTTTTATATGTGTGTTA 
ATTTACTCCTTTCAAGAGCCCTATGATACATGAATTTATCTCCATTTTATAGATGA 
GGAAATTAAGACCTAGAGTTACTGAACTTGCCCAAGGTTATACAGCTGATGGGTAG 
GGCCAGAACTTTGCCTCAGAGAATCTGAATTTCCAAAAAATAACCTAAAAGAGAAA
TTTAAGTACTAATTAGTAAGCAAAGAAATGCACATTTAAGGAAGACAGTGCACATT
TAAGGAAGACAGTAACCTTTTATCTATTAGAGAAAAACACACATTCTGTCTTTAAC
ACACACATAAATCTTATATTGGCAGGGATTTTCTTTATTCAGCAATTATTTATTGG 
TTGTCTGCTTTGTGGTACACATAAATGCTGGGGATAAACACTTAATAAAATATACT 
TCCTTCTCTTGAATATCTTGCACTTTAAGTGGGAAGGTAAGTCAACAGAGTAGAGG 
TGATATATCCAAGTGATAGACTGTTTCATTGCCAGTAG 

Sequence ID 619 SEQ ID NO: 222 

GTTGCCTGAGAGTGACCTTTGCATCTGCCTGTCCAGCCAGCATGGAACCAAAGCGG 
ATCAGAGAGGGCTACCTTGTGAAGAAGGGGAGCGTGTTCAATACGTGGAAACCCAT 
GTGGGTTGTATTGTTAGAAGATGGAATTGAATTCTATAAGAAGAAAAGTGACAACA
GCCCCAAAGGAATGATCCCGCTGAAAGGGAGCACTCTGACTAGCCCTTGTCAAGAC 
TTTGGCAAAAGGATGTTTGTGTTTAAGATCACTATGACCAAACAGCAGGACCACTT 
CTTCCAGGCAGCCTTCCTGGAGGAGAGAGATGCCTGGGTTCGGGATATCAATAAGG 
CCATTAAATGCATTGAAGGAGGCCAGAAATTTGCCAGGAAATCTACCAGGAGGTCC 
ATTCGACTGCCAGAAACCATTGACTTAGGTGCCTTATATTTGTCCATGAAAGACAC 
TGAAAAAGGAATAAAAGAACTGAAT



Sequence ID 621 SEQ ID NO: 223

TGGTACTGAACCTACGAGTACACCGACTACGGCGGACTAATCTTCAACTCCTACAT
ACTTCCCCCATTATTCCTAGAACCAGGCGACCTGCGACTCCTTGACGTTGACAATC 
GAGTAGTACTCCCGATTGAAGCCCCCATTCGTATAATAATTACATCACAAGACGTC 
TTGCACTCATGAGCTGTCCCCACATTAGGCTTAAAAACAGATGCAATTCCCGGACG 
TCTAAACCAAACCACTTTCACCGCTACACGACCGGGGGTATACTACGGTCAATGCT 
CTGAAATCTGTGGAGCAAACCACAGTTTCATGCCCATCGTCCTAGAATTAATTCCC 
CTAAAAATCTTTGAAATAGGGCCCGTATTTACCCTATAGCACCCCCTCTACCCCCT 

Sequence ID 622 SEQ ID NO: 224 

TTTTTCTTGTTTTTGTGTGTCTACCTTGGCATATACTAAAGGAAGGTGTGTATTCA 
TTTATTACATGATATCTCTGGGTTATAATTATTTACATATATGAATTTGAAAGAAA 
GATTGAGAGGGATATGTGTGACCTTTGTTTCATTATGATCATTTACATGACTAAAG 
ATAAAGATCATATGTCTGATTTTCAGTTTAATGGCAAGTTACTTAAAATAAATGAA 
ATATGTTTTTATTGTTTTCGTGGGTTTGATGCTTTGTGTTTTATTTCAAGTAACTT 
GAGAATGCATTGTGTTTGGTACTGTTTTTTATGAATATCATTAAAAATTTATTTAA 
GGAGAGAGTAATTTTGCAATAATATTTTTGATTTATTTGAAAATAAAATTCAAGAT
AAATGAAATAATTGAAATTTTCTAAAGAAGGAATTGAATATATTTTTACATTTGAA 
TGAACTAAGGATTAACTGAACCATTTATATATAGTACTTTCAGAACTGAATGTCTT 
AAATGATAAAGCTCTAATTGGTTAAAGTGACTTTCTTTCAAGTCAAAGAACCCAGA
AACTGAATAGATGATCTAACTACTGCCACTGAGGTTTTGGATTAGTGAGTATAAAT
TT 

Sequence ID 624 SEQ ID NO: 225 
TGCAGGATCCGTCGACT 

Sequence ID 625 SEQ ID NO: 226 

GACAATCAGAGCAGATCTTGGGCTTCTGTGGCTCATCTCAGCCCTTTATAACTGGC 
CTGAGAAGAGGGTTTATCTACTTGTGCAAGTGGCCCAGAAATCTCACTCGTACATG 
AGGCTTTGGAACATCCTTGCAAAGGTACGCTGAAAGCAAATTGCTGTTTTCCTGGT
GGTTCTGCACGTTTCCTAACTTTTATCATAGTTTGATTTTCATTATTTAAGAAAAA 
ATAAAAAATCCAAAGACCATAAGATGGCATTAGATTTTTTACCATTAAATTATTAA
TGCCTATTTGGTGCTCATAAAGATTAATCATGTCACGCATGTTTCCAATCTTTCTT 
TTGCAGTATATTATTTTCTAAAAATTGTTACATGCAAATTTAAACCAAGATTTATC 
AGTA

Sequence ID 626 SEQ ID NO: 227

TTGGAAGAAATAAACCAAGGCAGAAAAATTTTAAATGGCCAAAATAAATTGTATTG
CTAACTTAGATGGCCACAGATGGGGGCAGGGGTGGAGAGAGGAGAAATTGAAAACN
CCACAAAGACCCCGCAATGGCTAGAACTTGAAATCTCTGGATATTGCAACAATAGC
AGCCTCCTTAAGTCAGCAAAAAGATAAAGATTGATCCAATGTTCTATATTACAGAA
CAGAGCAGATTGTCAATATAGCAAATAAAGTTACCGTTGAGTGGACTGCGCTGTNT
AAGCTGCTTGGTTGGCCTTAAGTGCCGACAATTAAGAGATGAAGGCAATGAGAACT
GAAACAAACATTTAAGTTCAAGACCCAGTTTACTGACACTGGGACTATTACTATAT 
CTCTTTGGGCCTCAGTTTACTTATCTGTAACATTAAGAGGTTGGATTACATGATGT 
CTCACGATTCTTTTTTTTTATTTAGAGATGGGGTTTTGCTCTGTTGCCCAGGCTGG 
AGTGCAGTGGCATGATCATAGCTCACAGCAG 

Sequence ID 627 SEQ ID NO: 228 

CCAGCCTGTCACTGGCCTGGCCAAGGAGGAGAGACAGGCCAGGGATTCTGGTCCTA 
ACTCTACTGGCCACACTGTGTGGCCTGAGACCCCCCTTTCCCTCCCAAGCCCCTGC 
CTCCGCATCTGCGTGGTGAAGGCCATTGGCCCTCATCGGTGGATCTGCGTTTCCTC
GGGCCTACACTGTCTAGGATTGTGCGGGGCTGGTGAGAGAACAAGATCTCTTCCGT 
GTTCAAGGCAGACTTCCTGCCCCCTGCACCCTGCTCTCTCCCAGGCCTTGAGGTCA 
GTGTGAGCCCCAAGGGCAAGAACACTTCTGGAAGGGAGAGTGGATTTGGCTGGGCC 
ATCTGGATGGAAGGTAAAAAAAAGAAAATCCCTTGAAAGGAGATTGAGGGAAGTTT

Sequence ID 628 SEQ ID NO: 229 nt: 419

AAGAGAAAGGACTCAGTGTGTGATCCGGTTTCTTTTTGCTCGCCCCTGTTTTTTGT 
AGAATCTCTTCATGCTTGACATACCTACCAGTATTATTCCCGACGACACATATACA 
TATGAGAATATACCTTATTTATTTTTGTGTAGGTGTCTGCCTTCACAAATGTCATT 
GTCTACTCCTAGAAGAACCAAATACCTCAATTTTTGTTTTTGAGTACTGTACTATC
CTGTAAATATATCTTAAGCAGGTTTGTTTTCAGCACTGATGGAAAATACCAGTGTT 
GGGTTTTTTTTTAGTTGCCAACAGTTGTATGTTTGCTGATTATTTATGACCTGAAA 
TAATATATTTCTTCTTCTAAGAAGACATTTTGTTACATAAGGATGACTTTTTTATA 
CAATGGGAATAAATTATGGCATTTTTT

Sequence ID 629 SEQ ID NO: 230 

CTGAGAGTCACTGTGTTTTTAGCCAAATCTAAGGGAGAAAATGAATATTGATAGCA 
GCATGCTGTAGCCAGCTCCTTAAAGGAAGGATGGTGCCTGGTACAGAGTTAGAGTT 
AGTGCTTCAGTAAATAATGAATGTGTGCTAGGTAGGTTCTGCTGGGTAGGCTGCAT 
GCATTGACCAATTTATTCCTCCTTGTTTCAAAACAGGATTTAAGGGCACTTATATA
TATATATTTTTTAGTTTTTTTAATGTAAATGAGAGAATAAAGATATATATATATGT 
CTATATATGTATATATGTATATATATGTCTATATGTCTATATGTATATATGTCTAT 
ATGTATATATGTGTGTGTGTATATATATATATATATATATAAGTTTTCTGTTGCTA 
GCATAACAAACTACCAGAAACTTAGCAACTGAAACAACATGAATTTATCTTACGGT
TCTATAGTTCAGAAGTCTAACGTGTCACTGGGATGAAATCCAGGTTTCAACAGGAC 
TGGGTTCCCTTCTAGCTCATTCAGCTACCTGGCTCATTCAGGTTGTNGGCAGAATA 
TACTTCCATGAAACTGTAGGGCTGAGACCCCGTTCCTTCCTGGCTATCATCTGAAA
ACTTTC 

Sequence ID 630 SEQ ID NO: 231

AGGCGCAGCCCAGCCTCGAAATGCAGAACGACGCCGGCGAGTTCGTGGACCTGTAC 
GTGCCGCGGAAATGCTCCGCTAGCAATCGCATCATCGGTGCCAAGGACCACGCATC 
CATCCAGATGAACGTGGCCGAGGTTGACAAGGTCACAGGCAGGTTTAATGGCCAGT 
TTAAAACTTATGCTATCTGCGGGGCCATTCGTAGGATGGGTGAGTCAGATGATTCC 
ATTCTCCGATTGGCCAAGGCCGATGGCATCGTCTCAAAGAACTTTTGACTGGAGAG 
AATCACAGATGTGGAATATTTGTCATAAATAAATAATGAAAACCTAAAAAAAAAAA
AAAAAAAAAAAAAA 

Sequence ID 631 SEQ ID NO: 232 

TNCACTCACACACTCCCAAACCTTAACAAACACATACATGTGCAGCCAACCCAATG 
GGCCAGCCTCTTTTATGCTCCTCACATGTTTCCTTTAACTGGAATACCCATGACAG 
CTCCCTACATAGTTACTTGTAAACTCCTCCTCTCTGTATAAGTTTTCCTGAATTTT 
TTTGATAAAATTAAGTTGTGCCACCCCTTTATGCTCTCTTANAACTTTGTTCTGTT 
CTCATGGCTGTTCTGCAACGAATCTCATTGTGTTCTCCTACTCAATTACATTCCTG 
CGTCTCCCACTAGATGGCAGACTCTTTGAGAGTAGGAGATTCCCTTGTTATCTCTG 
GATCCCTGGCACTTGCAGAAAGCCTGTTACGTAATAATTGCTCAACAATTAGTTTT
TAAATAAATGAATTATTTTTAAAACGCCAAAATTACAATGATTGTGCATTAAGTGA
AAGATGACCATCTAAAAACATAAAGCCATGCTTCATGACATTGGC

Sequence ID 632 SEQ ID NO: 233 

GACCATTCAGGGAAATTTTATAAAAAATGCAGATACTGTCTTGAGCAGATCGAAAT 
GCCGATGAGGTGGATGCAATTTCCTTTTGTGCAAGCAGTGCACGGTGCCCCCCCCT 
CGGGTGTCCGTGCTGTGCCTTAGCTTCCCCAGGTGCCGGGACTCACACCTGCTAGG 
GGCTGGGCAAGGCCCCGGCTCTGCTTTCTCTGAAGGGCTTGTCCAAGTTCATTGCC 
CTGTTACAGGTGGTCAAGACGTCCGGCCGCCTTGACCCAGGCTACCCTTAGCCAAT 
ATCCTCTGCCCCTGGGTGGTTGGTGGCTGGGCCTCAGGGTGGGCAACGTTAGGGGT
TTGGCGAAAGCCCGCCCCATGGGATTGAGGGACGGGGCTGCACTCCAACCGTCTGC 
ACCTGCTCTTCCCCCACCCCTGTGGGACCTCATCTTCACGTGCCATGTGTGCTGAA 
GGCCCAGGGCCCAGCAGGGGGCAGTGGCACCTGTTGACGGAAAAGCCGAGGTGCTT 
ACCAATGGACCTTCTGGCCCGCCCTCCCCTGTACTTGTCGGGCATTCAGGGCCCCG 
ACCTGTGCCTACCCGCA 

Sequence ID 633 SEQ ID NO: 234 

CAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCACCTGACTCCTGAGG 
AGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGT 
GAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTC 
CTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTC 
ATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTC 
AAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCC 
TGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTG 
GCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCANAAAGTGGTGGCTGGTGTG 
GGCTAATGCCTGGCCCCACAAGTATCACTAAGCTCGCTTTCTTGCTGTCCAATTTC 
TATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACTGGGGGATATTATGAA 
GGGCCTTG 

Sequence ID 634 SEQ ID NO: 235 nt: 511

TTTTTTAATTTCACCAAAATTTGTTGACGTCCCTTGATTTGCTGATAGGGACAATA 
ATTAAATATTTTCCACTTGTTTTTATAAAAACTGTAATGGTGATTTGTTTAACAGA 
TGTTGACTTAGCACCTTCTCTCTTTTTTTTTTTTTTTTTTTGAGTTGGAGTCTTGC 
TCTGTCACCCAGCTGGAGTGCAGTGGCACGATTTCGGCTCACTGCAACCTCCGCCT 
CCCAGGTTCGGGCGCTTCTCCTGCCTCAGCCTCCCANATAGTTGGGATTACAGGTG 
CATGCCGCCACNCCTAGCTAATGTTTTTTGTATCTTGGTANANATGGNGTTTCACC 
TTGTTGCCCATGCCGCTCTTGAACTCCTTGGCCTCCCAAAGTGTTAGGATTACAGG 
CGTGAGCCACTGTGCCTGGCCCCAATTTANCACCTTACTGGGTGCTGAGGCTGTGA 
GCCATAGTAGAATGCATGTGATCCAGGGCCTTGCTGAATTCATGGGCTAATAGGGA 
GCCTGAC 



Sequence ID 635 SEQ ID NO: 236 nt: 592

TGAGCGTTGGGCTGTAGGTCGCTGTGCTGTGTGATCCCCCAGAGCCATGCCCGAGA 
TAGTGGATACCTGTTCGTTGGCCTCTCCGGCTTCCGTCTGCCGGACCAAGCACCTG
CACCTGCGCTGCAGCGTCGACTTTACTCGCCGGACGCTGACCGGGACTGCTGCTCT 
CACGGTCCAGTCTCAGGAGGACAATCTGCGCAGCCTGGTTTTGGATACAAAGGACC 
TTACAATAGAAAAAGTAGTGATCAATGGACAAGAAGTCAAATATGCTCTTGGAGAA
AGACAAAGTTACAAGGGATCGCCAATGGAAATCTCTCTTCCTATCGCTTTGAGCAA 
AAATCAAGAAATTGTTATAGAAATTTCTTTTGAGACCTCTCCAAAATCTTCTGCTC 
TCCAGTGGCTCACTCCTGAACAGACTTCTGGGAAGGAACACCCATATCTCTTTAGT 
CAGTGCCAGGCCATCCACTGCAGAGCAATCCTTCCTTGTCAGGACACTCCTTCTGN 
GAAATTAACCTATACTGCAGAGGTGTCTGTCCCTAAAGAACTGGTGGCACTTATGA 
GTGCTATTCGTGATGGAGAAACACCTGACCCA

Sequence ID 636 SEQ ID NO: 237 nt: 572

CTTANAAGAGTTGCTCATTCACACCCACGCCCTTGCCCAAGGCTGGCCCACTCAGA 
GCGAAACTTAACTTTTGTCTGGATGGGAAGAGAAGTAAGTCTACCCCGAGGTTGCC 
ATGTTGAAGAGTGAGAGGTCCAAGTGATTCTGTGCATTGAAACCAAGACACCCCAC 
CCAGAACACTTCTTCCCTCCCTCAGCCCAAACCAAAGGCTGGGGTTCTCATCTCCA 
AGTGGCTGTTCTCCAACTTTCCCAAGCCGCTTGCATTCCCCAGACTGGACTACTGT 
GGCGGTTAGGTTAGATTTGAAGACGGGGCCCAGGCTGGGTATGAACGGGTGCAGCC 
CTCTTCTCCTCTTCCCCCCCACATCTCTCATGAGAGAGGTAGTGGCATTTCCTTCT 
CAGGGAGCTTCAATGGGAAAGGTCTCGAAAGCTTCAGGAGGAGCAGAATACCAACG
CAGGGGGATGGCTGTAACGATCTCACCGTCTCCTAACCTCAGTCCCTTTTTTGAGA 
GTGAATGGTGGAGGGTGGGAAAGGGACCCAAATTTGTAGATCTCTTTGTCTGGGGG 
AGGGGAANGATG 

Sequence ID 637 SEQ ID NO: 238 nt: 482

TTAAAACAGGCGCAGGGGTAAAAATGAGAATGAATCTGAAAAAAGAGAGTTGGTGT
TTAAAGAGGATGGACAAGAGTATGCTCAGGTAATCAAAATGTTGGGAAATGGACGA
TTGGAAGCATTGTGTTTTGATGGTGTAAAGAGGTTATGCCATATCAGAGGGAAATT 
GAGAAAAAAGGTTTGGATAAATACATCAGACATTATATTGGTTGGTCTACGGGACT 
ATCAGGATAACAAAGCTGATGTAATTTTAAAGTACAATGCAGATGAAGCTAGAAGC 
CTGAAGGCATATGGCGAGCTTCCAGAACATGCTAAAATCAATGAAACAGACACATT 
TGGTCCTGGAGATGATGATGAAATCCAGTTTGACGATATTGGAGATGATGATGAAG 
ACATTGATGATATCTAAATTGAACCAAGTGTTTTTACATGACAAGTTCTCTGAGGA
TGGTTCTACAGTTGGGATTTTGGCCATCATCAAC

Sequence ID 638 SEQ ID NO: 239 nt: 545

TTTGAAGGCAAAGAGGGATTAATCTGTGCTGGCATCATGTAAGGAGACTTGATAGA 
TAAGAAAAAGCTTTACCTAAGTTTTGAAGAATAGGTTTTTCATAATGGAAAATTTA
AGGGAAAAATCTCCAAAAAAGTGCTACTCAAGTTTTATCCATTTGTATTTCCAACA 
CAGCCTAGGACAGTACCTGCACATAGTAGGTGATTAATAAAAATTTAGAAAGCATT 
AATACTAAAGAGGAAAAATAGCAATGGCAAGAAAACACATGTAGGGAACACATGTA 
GCCAAAAAATAATATATAATCAGAGAAATAATAGGACTTCTGGAAAAAAAAGATGA 
GATCAGATTGGTTAGGATCTTTACTAACATGACAAGAGCATGAATTTTTTTTCTGT 
AGATAATAAGTATGAAAGAATTTTAGCTTAAAAATTAGCATAATTTGGATCCACAT
ATGCAAATCAATGAATGTAATTCATAATATAAACAGAACTAAACACAAAAACCACG 
TGATTATCTCAATAGACACAGAAAAGGCCTTCAAAAAAATT

Sequence ID 639 SEQ ID NO: 240 nt: 624

GACACACGAGCATATTTCACCTCCGCTACCATAATCATCGCTATCCCCACCGGCGT 
CAAAGTATTTAGCTGACTCGCCACACTCCACGGAAGCAATATGAAATGATCTGCTG 
CAGTGCTCTGAGCCCTAGGATTCATCTTTCTTTTCACCGTAGGTGGCCTGACTGGC 
ATTGTATTAGCAAACTCATCACTAGACATCGTACTACACGACACGTACTACGTTGT 
AGCCCACTTCCACTATGTCCTATCAATAGGAGCTGTATTTGCCATCATAGGAGGCT 
TCATTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCTACGCCAAA 
ATCCATTTCACTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTT 
TCTCGGCCTATCCGGAATGCCCCGACGTTACTCGGACTACCCCGATGCATACACCA 
CATGAAACATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAGTAATATTA 
ATAATTTTCATGATTTGAGAAGCCTTCGCTTCGAAGCGAAAAGTCCTAATAGTAGA 
AGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCCCACCCTACCACACAT 
TCGAAGAA 

Sequence ID 641 SEQ ID NO: 241 

CAAGATGACAAAGAAAAGAAGGAACAATGGTCGTGCCAAAAAGGGCCGCGGCCACG 
TGCAGCCTATTCGCTGCACTAACTGTGCCCGATGCGTGCCCAAGGACAAGGCCATT 
AAGAAATTCGTCATTCGAAACATAGTGGAGGCCGCAGCAGTCAGGGACATTTCTGA 
AGCGAGCGTCTTCGATGCCTATGTGCTTCCCAAGCTGTATGTGAAGCTACATTACT 
GTGTGAGTTGTGCAATTCACAGCAAAGTAGTCAGGAATCGATCTCGTGAAGCCCGC 
AAGGACCGAACACCCCCACCCCGATTTAGACCTGCGGGTGCTGCCCCACGTCCCCC 
ACCAAAGCCCATGTAAGGAGCTGAGTTCTTAAAGACTGAAGACAGGCTATTCTCTG
GAGAAAAATAAAATGGAAATTGTACTTAA

Sequence ID 642 SEQ ID NO: 242 

TGCTTGGCCCTCTACCTCCTGCCCTCTTCCTGTTCATCTCCCAACCACTGCACTCT 
TGATTTTTATACCACACAGAAGGTAAGAAAATTCTAGGAACCCTAAGGATCAATCC 
TCTCCATTTTCACTCAAATGCCTGGGGCCCAGCTCTGCAATGACTGACTCCAGGGC 
CTCTTTCCTCACTGCCAGCATAGAAGTCAGGGGAGCCAGCTGGGCCCTGCGGTCAG 
GAAGGTTCTCATTTTTGGAGCATTCCCTGAGCCCAGATCATAGGAGCAGCTGTCCC 
TGGTGGGACACAGGAGTCATGACTCCTACCCTCCACCCTCCACACCCACCAGGCAT 
TTAGCAGTCTGTCCTATGCAAGACAGATGAATTCTCAGCCAGGATACCTCAAGGCA 
GGCAAAGGTGAGTGGAGGGAAAATTCACAAACATTCAGGGTGTGTGGTGCTGGCAT
CACCATGGCCAAATCCAAGAGGTCTTCCTGGAAGAGGGCCCAAACTGGAACCAAAA
GAATGCTGTCAGCAGTTGGAATAGAGCTGTGAATT 

Sequence ID 643 SEQ ID NO: 243 

CTTTCCAAGAGGAATCCTCGGCAGATAAACTGGACTGTCCTCTACAGAAGGAAGCA 
CAAAAAGGGACAGTCGGAAGAAATTCAAAAGAAAAGAACCCGCCGAGCAGTCAAAT
TCCAGAGGGCCATTACTGGTGCATCTCTTGCTGATATAATGGCCAAGAGGAATCAG 
AAACCTGAAGTTAGAAAGGCTCAACGAGAACAAGCTATCAGGGCTGCTAAGGAAGC
AAAAAAGGCTAAGCAAGCATCTAAAAAGACTGCAATGGCTGCTGCTAAGGCACCTA 
CAAAGGCAGCACCTAAGCAAAAGATTGTGAAGCCTGTGAAAGTTTCAGCTCCCCGA 
GTTGGTGGAAAACGCTAAACTGGCAGATTAGATTTTTAAATAAAGATTGGATTATA 
ACTCT 

Sequence ID 644 SEQ ID NO: 244 

CTTTGATAGAGAAGAAAATTCTCCTAGGATACAAGAGCCTCAACATTTTAAAGATT
TTCTGCATCTCAAAAGCGTAGGCTCCTTGCTGGGCAAGGTGAGCCTCTGTGAGTCC 
TCATAGGACCGAGCAAATCTGATTCACCCCAGAAAATCCAATATCGAAGCTGAGCT 
TTGGCCTGAGCGGGTTCCATTTCCTCCCCAGATCCTATTTAGGAAGTGTCTCCTGA 
CAACCTCCAAAAGGTGCTAACATGCAACGTTCTGAAGGGTTATTGCTCAAAAACAA 
GATTTTCCTTGTGGTCAAGACTCTGCGAGCCTCGAACACGATGAATCCGCTCGAAT 
GGGCTTGGGCTTTGCCCGGGTGGCGCACGCTCACACGCTGGAAGCACAGCTTTGAC 
GATCTCCACACACGCACAGGCACACACGCCACAGATGATGCCGGCTCATTCTCAGG 
GGGTGTCTAAGTTCTGCTTTAAATATTTACCCCCTAATTGTACAAACAATAGGGGC
ATGAGCCTGGTACTCGATAAATGGGGACTTNCTTAAAA 

Sequence ID 645 SEQ ID NO: 245 nt: 649

CTACAGCCTGGGCAGCGCGCTGCGCCCCAGCACCAGCCGCAGCCTCTACGCCTCGT 
CCCCGGGCGGCGTGTATGCCACGCGCTCCTCTGCCGTGCGCCTGCGGAGCAGCGTG 
CCCGGGGTGCGGCTCCTGCAGGACTCGGTGGACTTCTCGCTGGCCGACGCCATCAA 
CACCGAGTTCAAGAACACCCGCACCAACGAGAAGGTGGAGCTGCAGGAGCTGAATG 
ACCGCTTCGCCAACTACATCGACAAGGTGCGCTTCCTGGAGCAGCAGAATAAGATC 
CTGCTGGCCGAGCTCGAGCAGCTCAAGGGCCAAGGCAAGTCGCGCCTGGGGGACCT 
CTACGAGGAGGAGATGCGGGAGCTGCGCCGGCAGGTGGACCAGCTAACCAACGACA 
AAGCCCGCGTCGAGGTGGAGCGCGACAACCTGGCCGAGGACATCATGCGCCTCCGG 
GAGAAATTGCAGGAGGAGATGCTTCAGAGAGAGGAAGCCGAAAACACCCTGCAATC
TTTCAGACAGGAAATCCAGGAGCTGCAGGCTCAGATTCAGGAACAGCATGTCCAAA 
TCGATGTGGATGTTTCCAAGCCTGACCTCACGGCTGCCTTGCGTGACGTACGTANC 
AATATGAAAGTGTGGCTGCCAAAAACCTTGCAG

Sequence ID 646 SEQ ID NO: 246 nt: 600

GAGATGTCTCGCTCCGTGGCCTTAGCTGTGCTCGCGCTACTCTCTCTTTCTGGCCT 
GGAGGCTATCCAGCGTACTCCAAAGATTCAGGTTTACTCACGTCATCCAGCAGAGA 
ATGGAAAGTCAAATTTCCTGAATTGCTATGTGTCTGGGTTTCATCCATCCGACATT 
GAAGTTGACTTACTGAAGAATGGAGAGAGAATTGAAAAAGTGGAGCATTCAGACTT
GTCTTTCAGCAAGGACTGGTCTTTCTATCTCTTGTACTACACTGAATTCACCCCCA 
CTGAAAAAGATGAGTATGCCTGCCGTGTGAACCATGTGACTTTGTCACAGCCCAAG 
ATAGTTAAGTGGGATCGAGACATGTAAGCAGCATCATGGAGGTTTGAAGATGCCGC 
ATTTGGATTGGATGAATTCCAAATTCTGCTTGCTTGCTTTTTAATATTGATATGCT 
TATACACTTACACTTTATGCACAAAATGTAGGGTTATAATAATGTTAACATGGACA
TGATCTTCTTTATAATTCTACTTTGAGTGCTGTCTCCATGTTTGATGTATCTGAGC
AGGGTGCTCCACAGGTAGCTCTAGGAGGGCTGGCAACTTA 

Sequence ID 647 SEQ ID NO: 247 

CGAATGTGCAGGTTTGTTACATAGGTATATATATGCCATGATGGAAATATTTATTT 
TTTTAAGCGTAATTTTGCCAAATAATAAAAACAGAAGGAAATTGAGATTAGAGGGA
GGTGTTTAAAGAGAGGTTATAGAGTAGAAGATTTGATGCTGGAGAGGTTAAGGTGC 
AATAAGAATTTAGGGAGAAATGTTGTTCATTATTGGAGGGTAAATGATGTGGTGCC 
TGAGGTCTGTACGTTACCTCTTAACAATTTCTGTCCTTCAGATGGAAACTCTTTAA
CTTCTCGTAAAAGTCATATACCTATATAATAAAGCTACTGATTTCCAAAAA 

Sequence ID 648 SEQ ID NO: 248 

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

Sequence ID 649 SEQ ID NO: 249 nt: 425

CAAAAAAACGAAGAAAAGTGACGACAGTCTGAGGGACTTATGGGAGATCATCAAGT 
GAACCACTATATGTGTAATGTAAGTCTTGGAATGAGAAGAGAGAAGGAGAAGGAGG 
AGAGAGCTTATTTGTAGAAATAATGGCTGAAAACATCCCAAACTTTCCTTTTTTTG
AGGAAAGAAATAGGCATACAAGTTCAAGAAACTCAAGGAACTCCAGAGAGGACAAT
TCTAAAGACACCCCCTCTAACATACATTATAATCAAATTGTCAAAAGTAAAATACA
AAGAGAATCTTTTAAATTGACAAGAGAAAAGCAGCTGGTCACGTTCAAGGGAGTTC 
TATAAGAATTTCAGCAGATTTCTCAGCAGAAACCTTGCAGGCCAACAGGCAGTGGG 
ATGATACATTCAAAGTGCAAAAAAAAAAAAAAA

Sequence ID 650 SEQ ID NO: 250

CGAGAGTTTACCAGTNGCCTAATAATGCAATAAAAAATGCTTTGAGATAGCTAACN
GCCCATAAAACAAACTCAAATTGCTTATAAAGTTTCTTCCCATGTTCCCATTTGAT 
GAAAAGTCTTACATCACATATAACTGGGAAGCAGGGGTCCCTCCTCAATTTTCAGA
CATTTTGAAAGGATGACAGTTCTGTTTGTTAGATGAGTAAACCTCTATATTCATAA 
GTTCTAAAATCCTTCATTATGAGGGATTCAAAGTATTTATAAAAACACTGCCCTCT 
AAAAATTTCCTCAGATCTGAAGTATGGNCTTGGNCCTGAATATACAGTGTTATCCT 
ATGTTTAAAAGGGTGATCCAGACATGAGACGCAACTAGTTGGTGCATAAGAAGGCC 
CCACTTGGCTATTTCATATCTACCTACAATTGACCAAAAAAAATTTTTTAGGCCAG 
CAATTATTATTTAGCTTCGCTCTTTCTAGTGCAAGAAACTGCAGGCTGGATCAGTA 
GTTCAACAGCTAAACAGTCATAAAATAGTCATTGGCATGTTAAATTTCTTTCAATG
CTTCAAAGATAAATTCCAATTCTATTTACTTATTCATTGNGACNGNATTACTAAAC
AGGTAAGGATGGGAATA 

Sequence ID 651 SEQ ID NO: 251 nt: 251

CTTTGGGAGGCCGAGGCGGGCGGATCACTTGAGGTCAGGGGTTCGAGACCAGTCTG 
GCCAACATGGTGAAACCCCAACTCTACTAAAAATACAAAAGTTAGCCAAGTGTGGT 
GGCAAGTGCCTGTAATCCCAGCTACTCGGGAGGCTGAGACAGGAGAATCACTTTGA 
ACCTGGGAGGCGGAGGTTGCAGTGAGCCAAGATCGTGCCACTGCACTTCAGCCTGG
GCAACAGAGCAAGATTCCGTCCATCTC

Sequence ID 652 SEQ ID NO: 252 

CTTTCTTCAGCCTTGCAGACACCTAAACATCATGTAATTACCTAAGGAATTCCCAA
GTGCCTCTTCCAGGTTATACGTGTAAATAGCTGTTTTTATGCAAGATTAGTTAGAT 
ACTGCTCTTTACAGGATGAGTGGTGTTGTCTTTGGCTGGGGGGGNCTTAAATGTGT 
TTCTAATGTGTGTGTCAAATAATTACCTGTTAAACAGACTGCCAATCTGGCTGAAG 
CCAATGCTTCTGAAGAAGATAAAATTAAAGCAATGATGTCGCAATCTGGCCATGAA 
TACGACCCAATCAATTACATGAAGAAACCTCTAGGTCCACCACCTCCATCTTACAC 
GTGTTTCCGTTGTGGTAAACCTGGACATTATATTAAGAATTGCCCAACAAATGGGG 
ATAAAAACTTTGAATCTGGTCCTAGGATTAAAAAGAGCACTGGAATTCCCAGAAGT
TTCATGATGGAAGTGAAAGATCCTAATATGAAAGGTGCAATGCTTACCAACACTGG
AAAATATGCAATCCAACTATAGATGCAGAAGCATATGCAATTGGGAAGAAAGAGAA 
ACCTCCTTNTTACCAGAGAGCCATCTTNTTTCT 

Sequence ID 653 SEQ ID NO: 253 

GTTGTGACTCGTTGGCATGTGATCTGAAGTTCCTGCCCTGCAGCTGACGAGCCAGT 
GTTTCAATAATTAAAAACAACTCAACTCACTGTCCTCCTGCCTTGAATTTGATCAT
TGCGCTTTGCATGTATGTATCACAATACCACATGTACCCCATAAATATGTACAAAG 
ATTATGTGTCAATAAAAAACAAAAATTAAAATCCCAATTTTTA

Sequence ID 654 SEQ ID NO: 254 

GTTGCTAGTAGCGGCAGGAAGATGTCAGGCTCACTTTCCTCTGATTCCCGAAATGG 
GGGGAACCTCTAACCATAAAGGAATGGTAGAACAGTCCATTCCTCGGATCAGAGAA 
AAATGCAGACATGGTGTCACCTGGATTTTTTTCTGCCCATGAATGTTGCCAGTCAG 
TACCTGTCCTCCTTGTTTCTCTATTTTTGGTTATGAATGTTGGGGTTACCACCTGC 
ATTTAGGGGAAAATTGTGTTCTG 

Sequence ID 655 SEQ ID NO: 255 

GTCCCCGGGAATCGCGGCCGCGTCGACGGTTTATTTTCAGTGCTTGAAGATACATT 
CACAAATACTTGGTTTGGGAAGACACCGTTTAATTTTAAGTTAACTTGCATGTTGT 
AAATGCGTTTTATGTTTAAATAAAGAGGAAAATTTTTTGAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAAAAAAAAAAAAAATTTTT

Sequence ID 656 SEQ ID NO: 256

TAGAGGCCTGAATAGGTAGACAATGGCAGCAGCGTTTTTAATCACAGTCCTATTCA 
TGCCCTAATTCGGGAGTGATGATTAAAGGACATTAGAGGGAGCACTTTGACATCTG 
ATCCTTTGAACTGACGTCTGTGCAGGCTGCACTCCATAGAGCTCACTTGGCCAAAC 
TGATTTCCTTAAATAAAGTGCTGTGATTTCCAATGTAGGAAATATTACATTAGAGC
CTATTGAAATGATTAGGAATTGAGGAGCTTTTCTTTAGGTGGGAATGTGGTGTATG 
CTGTATACTCACAAAAGTGAGATCATTAATATTGCATGTACTACTTTGAATATCAG 
GGACCACAGAGAAATAGCATGAGAAACGCCTTCCTGCAGTCATGCACTTAAAATGA 
ATATGAACAAAAATGTGGAACTCTGCTGTCATAGCTCTCCG 

Sequence ID 657 SEQ ID NO: 257 

GGGGCTTCTGCTGAGGGGGCAGGCGGAGCTTGAGGAAACCGCAGATAAGTTTTTTT 
CTCTTTGAAAGATAGAGATTAATACAACTCTTAAAAAATATAGTCAATAGGTTACT
AAGATATTGCTTAGCGTTAAGTTTTTAACGTAATTTTAATAGCTTAAGATTTTAAG 
AGAAAATATGAAGACTTAGAAGAGTAGCATGAGGAAGGAAAAGATAAAAGGTTTCT
AAAACATGACGGAGGTTGAGATGAAGCTTCTTCATGGAGTAAAAAATGTATTTAAA 
AGAAAATTGAGAGAAAGGACTACAGAGCCCCGAATTAATACCAATAGAAGGGCAAT
GCTTTTAGATTAAAATGAAGGTGACTTAAACAGCTTAAAGTTTA 

Sequence ID 658 SEQ ID NO: 258 

GACCTTTGAGAAAATTAATTTAAATCCTAGAACTTTGGGTGAACCGAAGAAATTTA
TAATATTTGTTTAGTTAATAACAGATAAAAAGGAAAGATTCAAGCCTATTGGATGA 
GAATTTGTACATTATTTTAGAGCTAATAATAATGGTTTTCAGTTTAGTGAGGATTT 
AAAAAATGTTTTTGAATCAAACTTTTTTTCTTTATAATCCTTTTTAACTAACTCAG 
GAAATAAGGTATTATGAAATCCACACACTGTTACCTCCTTAAAGTATGAGGATACT 
TCCCACTGTTTGGTCCACTAGTGGCTGATTATTTTGTTTGTGGATTATTTGTAATT 
TTCTTTTTAATTCTTCCTTAAAGAGCATGGCATTTGGAGTCACAGACCTATATTTG 
AATCCTGTCATTTACTAGCGTTTTGACCTTGAACAATTATGCTCAGAGTCTCAGTT 
TTTTCTTGTAAAGTGATGATGATACTACTTAACTCACAGGGTTGTAGTGAAGATCA
AATGAGATCATGTCTGTANAACACCCTGCCCGGCACTCAATAAGTATTAATAGGAA 
CCCATATACCTC 

Sequence ID 660 SEQ ID NO: 259 

TGTTTTTATTTTTTAAAAGGTATAAACACCAAAAAAAAAATTAACATTGTATGAAG
ATGGAAAATAAGAAGATGCACTTTCTGTAACTTTGTCTAAGGATTTAAATTACTAA
CTTATGAACTCCAATTTGAATTGAACTTAACTATCGGCTTTCTTACTGGTAAAATT 
ATATGGTTTATTTTAAATGCGTACATATTGACCAATGGCCTCTGAAAAAGCACATT
TTAGATACTGAAATTGAAGGAAAGAAAATGCATCTTCAAACATTTTTTGGAATCTC 
ACCACATATACTTTGTTANATTTGTGTATTGTAGGGTGTTTGTTTTGTATTTTTGT 
ATTGTATATGAACTTTTTTTAAATGTGACAGTTAAACACATCTTTAAAAGCATAGT
CACAGACAAAAGCATACAGTATAAAAATTTCCTTGAAAACTCCTACAATATTATAT 
TTGGAGGCAGCTTCAGACTGTTTTATTGG 

Sequence ID 661 SEQ ID NO: 260

CTCTGGCACACATTAGTTCCTCTTATATTACATTGATATAAGCAAGTCATATGGAT 
TTATCTGAGTGTAAGGAGAGCTGGAAAAAATAGTTTCTAGCAGGTCAGCCACCTCC 
CAGTGAGGGCTGCATACCATAGAAGGGGAGAATGAATTTTGGGAAAACAGGTAATT 
ATCTCTGTCACAGAAGGGGATGAAAAGTATGGTAGTTACNCAAGTTANACATCTGT
ATGGAAAATACCACTTGGTTCTACAAATGNGG 

Sequence ID SEQ ID NO: 261 nt: 627

GCCTCCCGGGTTCAGGGATTTCTCCTGCCTCAGCCTCCTGAGTGGCTGCATTGCAG 
GCACCTGCCACCACGCCTTGCAAATTTTTGTGTTTTTAGTGGAGATGGGGTTTTGC 
CATGTTGGCCAGGCTGGTCTCGGACTCCTGACCTCAGGTGATCCGCCCGCCTCAGC 
CTCCCAGAGGGCTGGGATTACAGGCGTGAGCCACTGTGCCTGGCCCCAAGTTTTGC 
ATCTTTTAATGCCCTCTGAACAAATACATAGAGAAAACTCTCAGAACAATTAAAAC
CTGCAGAGCAACAGTGTCCTCCATGTCTTAGGTTTCAAGTTTGCCTCTAAAATTCT 
AATCCATATTTTTCTACTTCTCAGATAATTTATGTGTGTGTACTCTTCCTAGACGT 
ACAAGAGACTTTTTAATGCTAAATATTTGTCAGTGCTTAACAAAAACTCAATTTCA 
CATTACTCATATTGTTTTTGTTTTAATTGAATGTGAATTAAATTTTTATTAGTTAT 
TTGATTTGGAATGTTATGTATGCCATTAACACTATTAGGGGAATCTCTAGCATTTC 
TGTATTTTTAAAGAATTTGATTCTTTTGTANATTCTGCCTGTGTGGCATTTTAAAC
ATGTGTGACAT 

Sequence ID SEQ ID NO: 262 nt: 345

ACCGGCGACATGGCCAAACGTACCAAGAAAGTCGGGATCGTCGGTAAATACGGGAC 
CCGCTATGGGGCCTCCCTCCGGAAAATGGTGAAGAAAATTGAAATCAGCCAGCACG 
CCAAGTACACTTGCTCTTTCTGTGGCAAAACCAAGATGAAGAGACGAGCTGTGGGG 
ATCTGGCACTGTGGTTCCTGCATGAAGACAGTGGCTGGCGGTGCCTGGACGTACAA 
TACCACTTCCGCTGTCACGGTAAAGTCCGCCATCAGAAGACTGAAGGAGTTGAAAG
ACCAGTAGACGCTCCTCTACTCTTTGAGACATCACTGGCCTATAATAAATGGGTTA 
ATTTATGTA 

Sequence ID 666 SEQ ID NO: 263 nt: 252

ATAATTCAGAACTTCTTCATATGCTCGAGTCTCCAGAGTCACTCCGTTCTAAGGTT 
GATGAAGCTGTAGCTGTACTACAAGCCCACCAAGCTAAAGAGGCTGCCCAGAAAGC 
AGTTAACAGTGCCACCGGTGTTCCAACTGTTTAAAATTGATCAGGGACCATGAAAA 
GAAACTTGTGCTTCACCGAAGAAAAATATCTAAACATCGAAAAACTTAAATATTAT
GGAAAAAAAACATTGCAAAATATAAAAT

Sequence ID 669 SEQ ID NO: 264 

TTACTTTTAACCAGNGAAATTGACCTGCCCGTGAANAGGCGGGCNTGACACAGCAA 
GACGAGAAGACCCTATGGAGCTTTAATTTATTAATGCAAACGGTACCTAACAAACC 
CACAGGTCCTAAACTACCAAACCTGCATTAAAAATTTCGGTTGGGGCGACCTCGGA
GCAGAACCCAACCTCCGAGCAGTACATGCTAAGACTTCACCAGTCAAAGCGAACTA 
CTATACTCAATTGATCCAATAACTTGACCAACGGAACAAGTTACCCTAGGGATAAC 
AGCGCAATCCTATT 

Sequence ID 670 SEQ ID NO: 265 

GGCTGATTCCTGAGCTATAAAAGCATAATTGCTTTATATTTTGGATCATTTTTTAC 
TGGGGGCGGACTTGGGGGGGGTTGCATACAAAGATAACATATATATCCAACTTTCT 
GAAATGAAATGTTTTTAGATTACTTTTTCAACTGTAAATAATGTACATTTAATGTC 
ACAAGAAAAAAATGTCTTCTGCAAATTTTCTAGTATAACAGAAATTTTTGTAGATG 
AAAAAAATCATTATGTTTAGAGGTCTAATGCTATGTTTTCATATTACAGAGTGAAT 
TTGTATTTAAACAAAAATTTAAATTTTGGAATCCTCTAAACATTTTTGTATCTTTA 
ATTGGTTTATTATTAAATAAATCATATAAAAATT

Sequence ID 671 SEQ ID NO: 266 

CAGGAAGTCACCTGGGATTGGCTGCCTCACCCACTCACAGTGCCATCCCTGCCCCA 
GGCCTCCCAGTGGCAATTCCAAACCTGGGTCCCTCCCTGAGCTCTCTGCCTTCTGC 
TCTGTCTTTAATGCTACCAATGGGTATTGGGGATCGAGGGGTGATGTGTGGGTTAC 
CTGAAAGAAACTACACCCTACCTCCACCACCTTACCCTCACCTGGAGAGCAGTTAT 
TTCAGAACCATTCTACCTGGCATTTTATCTTATTTAGCTGACAGACCACCTCCACA 
GTACATCCACCCTAACTCTATAAATGTTGATGGTAATACAGCATTATCTATCACCA 
ATAACCCTTCAGCACTA

Sequence ID 672 SEQ ID NO: 267

CAGGAAGTCACCTGGGATTGGCTGCCTCACCCACTCACAGTGCCATCCCTGCCCCA 
GGCCTCCCAGTGGCAATTCCAAACCTGGGTCCCTCCCTGAGCTCTCTGCCTTCTGC 
TCTGTCTTTAATGCTACCAATGGGTATTGGGGATCGAGGGGTGATGTGTGGGTTAC 
CTGAAAGAAACTACACCCTACCTCCACCACCTTACCCTCACCTGGAGAGCAGTTAT 
TTCANAACCATTCTACCTGGCATTTTATCTTATTTAGCTGACAGACCACCTCCACA 
GTACATCCACCCTAACTCTATAAATGTTGATGGTAATACAGCATTATCTATCACCA 
ATAACCCTTCAGCACTAGATCCCTATCAGTCCAATGGAAATGTTGGATTANAACCA 
GGCATTGTTTCAATANACTCTCGCTCTGTGAACACACATGG 

Sequence ID 673 SEQ ID NO: 268 

GGGTTTTCTTTCGGAAGCGCGCCTTGTGTTGGTACCCGGGAATTCGCGGCCGCGTC 
GACTGCTAAACAGAATACTGCTATTTTGAGAGAGTCAAGACTCTTTCTTAAGGGCC 
AAGAAAGCCACNTGNNCCCTNGGNCTAATCTGGCTGAGTAGTCAGTTATAAAAGCC 
NTAATNGCTTNNTNTTTGGNNTCNTTTTTNNCNGGGGNCGGNCTTGGGGGGGGTTG 
CNTCCAAAGATANCATNTNTTTCCAACTTTNTNAANNNAANNGTTTTAAAATCCCT 
TTTCCNCCNGAAAANANNGCCCTTTAAGNGCCNCAAAAAAAAANNGTNTTCTGCAN
NTTTTCTANTATNACAAANNTTTTNGTAGAANAAAAATTTTTTTTTAGNGGCTACC
CTTTNTTTNTTANNCANNGGAGTTTNTTTTTACAAAAAAAAAANATTGGGNCCCCT
CCACAACCTTGGGTCTNTAATNGGGGGGTTTTTAAATAAANCNTNTNTAAATCCCC 
CNNNNNNNNNCNNNNNNNNNCCNNNNNNNNNNNNNNCCCNNNNAAAAAATTTTTNC
TCCCCCNCCCTTTTTCTTCCTGCCGGCCCCAATTTAAGCCCNGGCGCTTGGGGCAA 
ATCCCCCTTTAGNGGGGGGGTTTANAAAAACCNGGGGCGGGGNTTTAAAACCNCGG 
GGNNNGGGGAA 

Sequence ID 674 SEQ ID NO: 269 

ACCTCTAGCATCACCAGTATTAGAGGCACCGCCTGCCCAGTGACACATG-TTTAAC 
GGCCGCGGTACCCTAACCGTGCAAAGGTAGCATAATCACTTGTTCCTTAATTAGGG 
ACCTGTNTGAATGGCTCCACNAGGGTTCACTTGTCTCTTACTTTTAACCAGTGAAA 
TTGACCTGCC 

Sequence ID 675 SEQ ID NO: 270 nt: 591

GTATAGAAAATAATGTCCCCAGNGCATAGAAAAAATGAGTCTCTGGGCCAGTGAAT 
ACAAAACATCATGTCGAGAATCATTGGAAGATATACAGAGTTCGTATTTCAGCTTT 
GTTTATCCTTCCTGTTAAGAGCCTCTGAGTTTTTAGTTTTAAAAGGATGAAAAGCT
TATGCAACATGCTCAGCAGGAGCTTCATCAACGATATATGTCAGATCTAAAGGTAT 
ATTTTCATTCTGTAATTATGTTACATAAAAGCAATGTAAATCAGAATAAATATGTT 
AGACCAGAATAAAATTAATTATATTCTGGTCTTCAAAGGACACACAGAACAGATAT
CAGCAGAATCACTTAATACTTCATAGAACAAAAATCACTCAAAACCTGTTTATAAC
CAAAGAATTCATGAAAAAGAAAGCCTTTGCCATTTGTCTTAGAAAGTTATTTTTTA
AAAAAAAATCATACTTACTATTAGTATCTATGGAAGTATATGTAACAATTTTTATG 
TAAAGGTCATCTTTCTGTGATAGTGAAAAAATATGTCTTTACTAAGTTGAAATGAA 
TACTTTCTGNCTTTGCTAATGGATAGTTATT 

Sequence ID 676 SEQ ID NO: 271 

CTCAATTCTACTAAAAAGCCCCCCAAGAAAAGCGAATGAGAAAACAGAGTCATCCT
CTGCACAGCAAGTAGCAGTGTCACGCCTTAGCGCTTCCAGCTCCAGCTCAGATTCC 
AGCTCCTCCTCTTCCTCGTCGTCGTCTTCAGACACCAGTGATTCAGACTCAGGCTA 
AGGGGTCAGGCCAGATGGGGCAGGAAGGCTNCGCAGGACCGGACCCCTAGACCACC 
CTGCCCCACCTGCCCCTTCCCCCTTTGCTGTGACACTTCTTCATCTCACCCCCCCC 
TGCCCCCCTCTAGGAGAGCTGGCTCTGCAGTGGGGGAGGGATGCAGGGA 

Sequence ID 679 SEQ ID NO: 272 

GNANCNTTTCCTNTCGNAAANCGCGCCTTGTGTTGGTACCCGGGAATTCGCGGCCG 
CGTCGACAAAAAAAAAAAAAAAAAAAAAAAAAAAAANTNTAGACTCGANCAAGCTT
ATGCANGCNTGCGGCCGCAATTCGAGCTCGGCCGACTTGGCCAATTCGCCCTATAG 
NGAGTCGTATTACAATTCACTGGCCGTCGTTTTACAACGTCGNGACTGGGAAAACC 
CTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGT 
AATANCGAANAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGG 
CGAANGGAAATTGTAAGCGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTA 
AATCAGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAA
AAGAATAGACCGAGATAGGGTTGAGNGTTGTTCCAGTTTGGAACAANAGTCCACTN 
TTAAAGAACGNGGACTCCAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGG
CCCACTACGTGAACCATCNCCCTAATCAAGTTTTTTGGGGTCGAGGNGCCGTAAAG 
CACTAAATCGGAACCCTAAAGGGAGCCCCCGATTTAAAGCTTGACGGGGAAAGCCC
GGCGAACGTGGCGAAA 

Sequence ID 682 SEQ ID NO: 273 

CACCTGCAGTCCAAGTACATCGGCACGGGCCACGCCGACACCACCAAGTGGGAGTG 
GCTGGTGAACCAACACCGCGACTCGTACTGCTCCTACATGGGCCACTTCGACCTTC 
TCAACTACTTCGCCATTGCGGAGAATGAGAGCAAAGCGCGAGTCCGCTTCAACTTG
ATGGAAAAGATGCTTCAGCCTTGTGGACCGCCAGCCGACAAGCCCGAGGAAAACTG 
AAACTTTGCTTAACNACCGAATGGNGGGGANCTTTTCCAACGNTTTT

Sequence ID 683 SEQ ID NO: 274 

TTGGTTTCATACTGNTGGGGNTTGAATGNTCCCTNCAACACTNATGTTGANACTTA 
ATCCCTAATGNGGCAATACTGAAAGGTGGGGCCTTTGAGATGTGATTGGATCGTAA 
GGCTGTGCCTTCATTCATGGGTTAATGGATTAATGGGTTATCACAGGAATGGGACT 
GGTGGCTTTATAAGAAGAGGAAAAGAGAACTGAGCTTGCATGCCC 

Sequence ID 684 SEQ ID NO: 275 nt: 545

GTGGAAGNGACATCGTCTTTAAACCCTGCGTGGCAATCCCTGACGCACCGCCGTGA 
TGCCCANGGAAGACAGGGCGACCTGGAAGTCCAACTACTTCCTTAAGATCATCCAA 
CTATTGGATGATTATCCGAAATGTTTCATTGTGGGAGCAGACAATGTGGGCTCCAA 
GCAGATGCAGCAGATCCGCATGTCCCTTCNCGGGAAGGCTGTGGTGCTGATGGGCA 
AGAACACCATGATGCGCAAGGCCATCCGAGGGCACCTGGAAAACAACCCAGCTCTG 
GAGAAACTGCTGCCTCATATCCGGGGGAATGTGGGCTTTGTGTTCACCAAGGAGGA 
CCTCACTGANATCAGGGACATGTTGCTGGCCAATAAGGTGCCAGCTGCTGCCCGTG 
CTGGTGCCATTGCCCCATGTGAAGTCACTGTGCCAGCCCAGAACACTGGTCTCGGG 
CCCGATAAGACCTCCTTTTTCCAGGCTTTAGGTATCACCACTAAAATCTCCAGGGG 
CACCATTGAAATCCTGAGTGATGTGCACTGATCAAGACTGG

Sequence ID 685 SEQ ID NO: 276 

GGAAAGGGCCATTTTATTGCCTAAAACCACCTGGNTTTTNAGGTAACAGTTCCAAC 
ATGTCCTTTTTTGAATAGCTGTTCTAATTATTATATATTCAGCTGATTAATAGGAG 
TACTTGATAGGTGGACTGTGTCAGGTAGCCTCAGGCAATCCTACTTCAACAAGCTG 
TCAGGGAGCCATGCCATGCTTCTTTATGACATAGGTGAATTTGATAGGCTCACTAG 
CAGAACATGGGATCACAAGGTGGAACCNTTCCNTTT 

Sequence ID 686 SEQ ID NO: 277 

GACCCCTTCCTTACACCTTATACAAAAAAACTGAAACTGGACCCCTTCCTTACACC 
TTATACAAAAATTAACTCAATTTTATTATGTTGTATTAAATTAAGTTGGGTTTAAT 
TAAGATGGATTAAAGACTTAATTATAAGACCTAAAACCATAAAAACCCTAGAAGAA 
AACCTAGGCCATACCATTCAGGACACGGGTATGGGCAAAGACTTCATAACTAAAAC
ACCAAAAGCAATGGCAACGAAGTCCAAATAGACAAATTGGACCTGATTAAACTAAA
GAGCTTCAGCACAGCAGAAGAGACTATCGTCAGAGTGAACAGGCAACCCACAGAAT
GGAAGAAAATTCTTGCAATCTATCCATCTGACAAGGGGCTAATATCCAAAATCTAC 
AAAGAACTTAAACAAATTTACAAGGAAAAACACAAACAACCCCATCAAAAAGTGGG 
CTAAGGATGTGAACAGACACTTCTCAAAAGAAAACATTTATGCAGCCAACAAACAT
GAAAAAAAGTTCATCATCACTGCTCATTAGAGACATGCAAATCAAAACCACAATGA
GATCCCATCCCACACCAGTTAGAATGGCAATCATTAAAAATGT 

Sequence ID 687 SEQ ID NO: 278 nt: 268

TTTATGTGTTTTTGCTTGGGGGGCGCTGGGCCTAGCCCAGAGTAGTGCTTGCTCCC 
CCTGCCTTGTCCCACCAGGGAGGCAGCAGACTCAGGCCCTCCATGGTCCTCTTTGT 
CATTTTGTTGACATGCATTCCTCCTTTTGTCATCTTGTTGGGGGGAGGGGATTAAC 
CAAAGGCCACCCTGACTTTGTTTTTGTGGACACACAATAAAAGCCCCGTTTATTTG
TAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

Sequence ID 688 SEQ ID NO: 279 nt: 569

CTTTAGCCAGCCTGATCAGAAAAAAACAAAAGAAGAGGAAAGACGTAGATTACCAA
CATCAAGAATGTGAGTTATGATATCACTACAGACTCTCCAGGTATTAAAAGCATAA
TTAGAGAATGATATGAGCAGCTATATGCAAATAAGTTCAACATTGGACAAATGGAC
AAATTTCTTGAAAGATAAATTATGAAATTTCATTCTGAAAGAACTACATGACCTTA
ATTGTCTTACATCTATTAAATAAGTGGAAATTGTAGTTTAGAAACTTTCCCACAAA 
GAAAACTCTAGGCCCAGATGGCATCAAAATAATATTCAGATGAATGAAATGGAGAA 
AGGATAGCCTTTTCAACAAATGGTGGTGGAACAATTGGATTTCCATATGCAAAAAA 
ATAGAGATGGACGCAGAGGTGTGTGCTTAGGAGGCTGAGGTGAGAGGATTGTTTGA 
GGCCAGCCTGGGCAACATAGCAAGACCCCATTTCAAAAACAAAAATAAAGAACTTG
TAGCCTTACCTTGTGCCATATTATGAAAATGTATCATAGGCTTAAATGTGAAACGT 
AAAACAAAA 

Sequence ID 689 SEQ ID NO: 280 

CGCAGGGGCTTCTGCTGAGGGGGCAGGCGGAGCTTGAGGAAACCGCAGATAAGTTT 
TTTTCTCTTTGAAAGATAGAGATTAATACAACTACTTAAAAAATATAGTCAATAGG 
TTACTAAGATATTGCTTAGCGTTAAGTTTTTAACGTAATTTTAATAGCTTAAGATT 
TTAAGAGAAAATATGAAGACTTAGAAGAGTAGCATGAGGAAGGAAAAGATAAAAGG 
TTTCTAAAACATGACGGAGGTTGAGATGAAGCTTCTTCATGGAGTAAAAAATGTAT
TTAAAAGAAAATTGAGAGAAAGGACTACAGAGCCCCGAATTAATACTAATAGAAGG
GCAATGCTTTTAGATTAAAATGAAGGTGACTTAAACAGCTTAAAGTTTAGTTTAAA
AGTTGTAGGTGATTAAAATAATTTGAAGGCGATCTTTTAAAAAGAGATTAAACCGA 
AGGTGATTAAAAGACCTTGAAATCCATGACGCAGGGAGAATTGC

Sequence ID 690 SEQ ID NO: 281

CGAAAAGCAAATATAACTTGCCACTAACCAAGATCACCTCTGCAAAAAGAAATGAA 
AACAACTTTTGGCAGGATTCTGTTTCATCTGACAGAATTCAGAAGCAGGAAAAAAA 
GCCTTTTAAAAATACCGAGAACATTAAAAATTCGCATTTGAAGAAATCAGCATTTC 
TAACTGAAGTGAGCCAAAAGGAAAATTATGCTGGGGCAAAGTTTAGTGATCCACCT 
TCTCCTAGTGTTCTTCCAAAGCCTCCTAGTCACTGGATGGGAAGCACTGTTGAAAA 
TTCCAACCAAAACAGGGAGCTGATGGCAGTACACTTAAAAACGCTCCTCAAAGTTC
AAACTTAGATTTCAGATTT

Sequence ID 691 SEQ ID NO: 282 

CCGGTCTCTACACAATATATAGAAATCTGGGCATGGTGGTGCCTGGCTGTAGTCTC 
AGCTACCTAGTTGGGTGAGGTGGGAGAGTCGCTTGAGTCCTGGAGGTTGAGGCTGT 
AGTGAGCCAGGGCTGCACCACTGCATTCCAGCCTGGGTAACAGAGTGAGACCCTGT 
CTCAAAAAGAAAAAAAAAAATTGCTAATTTTAACAAATCACAAAACTGACTCAGGC
AAGTTGTCTGACTCAAAAGCCCTTGAAAAACCATCAAAGACAGTAGAATGTTAACT
GGTCATTTACGTAAAATAGTGTTCATTAAATTTTTGGTTCATTTAGGATAATCATT 
TTAAATGAGACTGTATTTGAGACTGTATACACATACATATACATGTTTACACACAT
ATACGTACAATATATGTACATTCTATCTAAAAGATCATACATGTGTGTACATATAT 
GTTTTTAAAAGTCAAACTGACATATTAATGGAAACAGTGCTTACATCTCTGGTAGT 
GATTTTCTATTAGCAGCAGCCCTACATATGCTGCGTCTCTGAACAGCATGTCAGTG 
CCATGACTGTCTAAACATGCAAATATGACTGACAGACTCTTGAGACAGCTTTCACC
TTG 

Sequence ID 692 SEQ ID NO: 283 

AATTCGNGGCCGCGTCNNCCTANGAGGCACCAGGAAATCCCGCGGGGTGGCCCATG 
CAGACCAGGCGCACGTGGCTCATGGGGCANAATTGCCAAGGACAGCTCACGACAGT 
GCCACCTTCTCACCATTCCAGCCAAGGAGAGATGTGACGTTGGAACTGCTCTGGCA 
CTTCTGTCAAGCCTCCCCCGCCCCAATTGCCTTGAGATCTCTGCTCTTTGTCAGAG 
ATTTGCAAAGACTCACGTTTTTGTTGTTTTCTCATCATTCCATTGTGATACTAAGA 
AACTAAGAAGCTTAATGAAAAGAAATAAAATGCCTATGTTGTTGTTCT



Sequence ID 693 SEQ ID NO: 284

CTAGAACCCATGACTCCTAGGTCTTATACTGCAACCACAGTATCAGCAAATAATCT
TTCATAAGGGGATTATTCTCTGATTAACAGGAAATACAGGAATTTAATTTGTGAAC 
ACGCTAGGTAGAAGCAGAAACCCAAATCCAAATCCAAATTTAAACATTTAAAATTC 
ATTCTATAACTAAGATCTAACAGTCATTTTCTTCCCAGTAAGAAATAACCAAAGCA
TGCTAAAAATCACTGGACTAAATTGGTGTCAAAACTGCCACATTGCCAGGCATGGG 
GGGGTCATACTTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGAAAATTGCTTGA 
GGCCAGGAGTTCGAAACCAGCCTGGGCAACACAGTGAGACCCCATCTCCACAAAAA 
AAAAAAATTAAAAAACAAAACAAAACATTAGCTGGGCATGGTGGTACACGCCTGTA 
GTCCCAGCTACTCAGGAGCCTGAAGTGAGAGGATCACTGAAGCCCAGGAGGTAGAG 
CTATGACTGTAGTGAGCTATGACTGTGCCACTACACTCCACCTGGGTGACAGGGGA 
CTC 

Sequence ID 694 SEQ ID NO: 285 

CGACTTCCATTTGTATTAATGGAATACTAAGTCCCTCTGTGATTTCTGAACCAAGC 
TATTCCTAGGCCTGAGTTTTATTTTGTTGACACAGAAATAAATTANAAGGCCAAGC 
GTGGTGGCATGTGCCTGTAGTCCTAGTTGCTGAGGTAAGAGGATTGCTTGAGCCCA 
GGAGTTCAAGGCTGCAGCAAGCTTTGATTGCGCCACTGCACTCCAGCCTTGGCGAC 
AGACTAAGACGCTGTCTCAAAAAAAAACAAAAA

Sequence ID 696 SEQ ID NO: 286 

GGTTATCAATGAGATTAAGAGACAACTAGAGTAAAAACAAAAGAAAAGAAAAGAAA 
NGAAAACAACAGAAGCTCTATTAACTGACCTCTAACCAATACAACAGGTTAACTGA 
TGTTCTCCATTCTGTATATAAAAATCCCAGTGGACACCCACAACACAGGCTTCAGG 
CTTGTAGGACACTTTCTAGTTCATCTGAGCACTTTTGTTCTCAGCAGTTGAGCTGT 
ATACTTAGCAACATTTGGTGCTTCCAAACCCATTTGTGCCTGTAGCACTTACTATT 
GAAATACATAATTTAATTAAATATTATATAAAGGAATGGAATACGAGTTGGACAAG 
AAAAAGAGTTAAATCTGAAGGTTAGGTAAAAAGAGCAACTTCTTTTCTCTGTTTTG
CAGGTTGGCAAAATCATTTAAAAACAATTGGAAGTATTATATGTTCTGCATTAAGT
TGTCATTTTACTTAAAAACTAGGCATCAAAGATGATGCATAATAAATTTAGTGTAT 
GCAAGAATGACTGCTTGGGACCTCAATATATGAATTCTTAATCCAAGGAAAGTCCT 
TGGCCTTACATTTAAAAGTCGGCAAATAAGTGTACGTTCATT 

Sequence ID 697 SEQ ID NO: 287 

GAACATTTAAAAATAATGCAAATAAGGCTGGGCGTGGGGGCTCACACCTGTAATCC 
CAGCACTTTGGGAGGCCGAGGCAGGCAGATCACGAGGTCAGGAGATTGAGACCATC 
CTGGCTAACACAGTGAAACCCTGTCTCTACTTAAAAAATAAAAAAATTAGCCAGGC
GTGGTGGTGGGCGCCTGTAGTCCCAGCTACTCAGGAGGCTGAGGCAGGAGAATGGT
GTGAACCCGGGAGGCGGAGCTTGCANTGAGCTGAGATCGTGCCACTGCACTCCAGC 
CTGAGCGACAGAGCGAGACTCTGTCT

Sequence ID 698 SEQ ID NO: 288 

TCATTAGAATCCAAGCTTTGAAAATTTCTGATTAATGCTCATGTATTTCTTTATCT 
TTGTTTTTCCTTGTGAAGAAAGACTTTCACCACTGTCTGAGTGATGATGCTGTTGA 
TAAGGATGATGTCGATGACTACTATATTGCATCTCTCAGGAACAGCTGATGGGAAG 
GGAGGGGCTGCTGAGTTCCCTTGTTCTAGCTAGCAGCACGCTCCTCANAGAGGGGG 
CCGAGTTACAGACAGCAGCCGCATTCTCATGCAAAATTAGTTTTAAACTGCTAGTG 
TGGGCATCGGTACCTTTTGCCTGGGTGATACCGAAGAATTGTTGAGGATTTAGTAT 
GCTCCGTAGAGACAGTTCAGCCAGTCATTTCTGCATTGGAGAGACTTCTCATACTT
TCTTTGAAGACTCATAGAAAGCTGGAT

Sequence ID 699 SEQ ID NO: 289 

ATTAAGGTTTGTNCCCAACAAGAATAGATGTAATTAGAAAAAANTGNCTTCCTTAC 
CTATTGCCTCTGATNTTTACTTGCTTAAATTTTTTTTATTGNAAATCCAGAAAAAG 
NGGATTTAGAGAACAACACTAACTCCCACCTAATCTATGACAGANATGTACAANAN 
AGTACCTGTGAAAAATGTGAAAGNATNTGAAAAATGTAACCTTTGGCAGCCTGAGC
ATAGTCAACCAGAAAAACTATCTGAATTAAAATAATTGGTCCATAGGTACTATTTT
ATTTGGTCCATAAGGATTATTTTTTCAACTTTTTTTTCAAGTGTATTATTATGTCA 
TTTCCCACGTAGGTTACTGATACCTGAAGACTTTTTNCACCTTTAACCTTNCTCGT 
TGAGGAGCTTTGTANTCTAATAAAAGAGAAATATAAGTAAATGTTAGATATATGGG 
NGGATAATGGTAACTATGTGCTTAAAGAGGTATAAAAGAAGGGTAGGGAGCAGATA 
AGACAAAGGAAGGGCTATATTATAANGAAGAATATTCCAAGTAGGGAAGAGAAAAA 
GATATGTTATCCATATAATATTTTATGTGCAGTAGAGAACATGTTCTATAGAANAG 
ACAGAAGATG

Sequence ID 700 SEQ ID NO: 290

CTTGAGCCCAGGAATTCCAGCCTGGGCAATATAGTAAGACTCCGTCTCTACAAAAG 
ATACAAAAATTAGCCAGATGTGGTGGTGCGTGCCTGTAGTTCCAGATACTGGAAAG 
ACTGAGGCAGGAGGATTGCTTGAGCATGGGAAGTTGAGGCTGCAATGAGCTGTGAT 
TACGCCACTACACTCCAGCCTGGGCAACAGAGTAAGATCTTGTCTCAAAAAAAAAA 
TTGAATTCAGCTAAAAATAATAAAATTTTAAAATAATTTTAAAAAGCCCTCAACAG
CTTTGTTTTTCTCTCCTTGCCAGCTTCTCTGCAGCCTATAGCCTGCAGGCTGGCTG 
CTGCGAGCCAGGACAAGCGGTGGGAAATGCAATCACAGCGTGAAATCTCTGTGTTC
AGAGACACGCAGGAAGCAGGTGAACCATGAAGGGCCAACACATGCCCCCAGTTAGC
AGGGTGTAGAGACCGGGGCAGGGCTTTCTTCTTCCTTCTGGGTTATAAATATCCAT 
GTCCTGCCATTTGAAGCTGCAAGTGGCACACATGGATGCTGGACAGGCGCTCGCAC 
TTTCTGGGCAGGGCANGGGGCTCAAAGGCAGGACAGCTGGGCAAAAGCACCTTGCG 
TGGGCCC 

Sequence ID 701 SEQ ID NO: 291 nt: 579

CTTTGGAGCTTCTGTCTGTGCTGTGGACCTCAATGCAGATGGCTTCTCAGATCTGC 
TCGTGGGAGCACCCATGCAGAGCACCATCAGAGAGGAAGGAAGAGTGTTTGTGTAC 
ATCAACTCTGGCTCGGGAGCAGTAATGAATGCAATGGAAACAAACCTCGTTGGAAG 
TGACAAATATGCTGCAAGATTTGGGGAATCTATAGTTAATCTTGGCGACATTGACA 
ATGATGGCTTTGAAGGTAATTAAAATTATCAAATTGGTGCTTGATTTCTGCTTTTA 
AAATGGTTTATGGAAGAAAATATGATTAAAGTTTTGTATTGTTTTCCTTCCTATAG 
AAGATGGAGCCAGAATGGCATGCTAAGTTTTTTCTTTTCTTTAGTGTTATATATGA 
CTTCTCCTCAATTGTCACCCATTGATCTTTACCACTGTTAATAATGGATGATATTC 
AAAATACCTTATTTCAGTGATTCTAAGGCACCATTGATTAGAAACTGCATTATTAT
TTATGTGTCCCTAAAAGCTACCTATTAAGCTGTTACACCCACCATTTTTCTGTTAA 
GAAAATCCTGATTTCAGAA

Sequence ID 702 SEQ ID NO: 292 

GTNNTCCTCTCGGAACGCGCCTTNTGTAGCCAGGTGCTACCAGACCNAATACACGG 
TTGTTCCAGCTTGCGCATTCACCGATGGCGTAGATATCCGGATCGGAAGTCTGGCA 
GGAATCATTAATGACAATACCCCCACGCGGAGCAACGTCCAGACCACACTGGGTTG 
CCAGCTTATCGCGCGGACGGATACCGGTAGAGAAGACGATAAAGTCGACTTCCAGT 
TCGCTGCCGTCGGCAAAACGCATGGTTTTACGCGCTTCAACACCTTCCTGCACAAT 
CTCAAGGGTGTTTTTGCTGGTGTGAACGCGCACGCCCATACTTTCGATTTTGCGAC 
GCAGCTGCTCGCCGCCCATCTGATCAAGCTGTTCTGCCATCAGCATAGGGGCAAAT 
TCGATAAC 

GTGGGTTTCAATACCTAAGTTTTTCAGCGCGCCTGCGGCTTCCAGACCTAACAGGC 
CGCAATTCGAGCTCGGCCGACTTGGCCAATTCGCCCTATAGTGAGTCGTATTACAA 
TTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAAC 
TTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAAGAGGC 
CCGCACCCGATCGCCCTTTCCAACAGTTGCGCACCTGAATGGCGAATGGAAATTGT 
AAGCGTTAATATTTTGTTAAAATTCGCGT

Sequence ID 703 SEQ ID NO: 293

CTGCGCAGACCAGACTTCGCTCGTACTCGTGCGCCTCGCTTCGCTTTTCCTCCGCA 
ACCATGTCTGACAAACCCGATATGGCTGAGATCGAGAAATTCGATAAGTCGAAACT
GAAGAAGACAGAGACGCAAGAGAAAAATCCACTGCCTTCCAAAGAAACGATTGAAC
AGGAGAAGCAAGCAGGCGAATCGTAATGAGGCGTGCGCCGCCAATATGCACTGTAC 
ATTCCACAAGCATTGCCTTCTTATTTTACTTCTTTTAGCTGTTTAACTTTGTAAGA 
TGCAAAGAGGTTGGATCAAGTTTAAATGACTGTGCTGCCCCTTTCACATCAAAGAA 
CTACTGACAACGAAGGCCGCGCCTGCCTTTCCCATCTGTCTATCTATCTGGCTGGC 
AGGGAAGGAAAGAACTTGCATGTTGGTGAAGGAAGAAGTGGGGTGGAAGAAGTGGG
GGTGGGACGACAGTGAAATCTAA 

Sequence ID 704 SEQ ID NO: 294

CTTGTATTCAAGAACTACTGTAATGCATTAGTGGTCTGGCTTCATTTTGTATGATG 
CCAGATCCTTAATTTACCCAGCACAATCATTTCAGTAGTTTCCTATGGCTCCTGCA 
AAAATGCAAACAGAAACCACCACAGGAACAGCCCCTTGCTGCCTCCTGTTGCTGAG 
GTAGTAGTCGCTAAAGAAAATTGAAGGCTCCTTACAATCTATATTTGAAAACTAGA 
ACTTCTGTAGAAACACACAGATCCCGATCTTAGAAGTTGTACAGGACAATCTGGTA 
AAACTGACATAATTGTGATTTATTAACATGAATTAAAATGCCCAACCAGTGCTTCA 
GTGTGACAGTATATTTAAAATAAAAAAGAAATTAAAGGTCATATACTGTACTACTT 
TCACAAAGATCCACAGTTTTGCAAAAGACTTGTCATATGTACAATGCTATATATCA
AATGAGAAAAGCTGTAAGCAATTATATACGCAAAAGAAATGGCAGTA 

Sequence ID 705 SEQ ID NO: 295 

TTCCAGTCCTTTCATTTAGTATAAAAGAAATACTGAACAAGCCAGTGGGATGGAAT 
TGAAAGAACTAATCATGAGGACTCTGTCCTGACACAGGTCCTCAAAGCTAGCAGAG
ATACGCAGACATTGTGGCATCTGGGTAGAAGAATACTGTATTGTGTGTGCAGTGCA 
CAGTGTGTGGTGTGTGCACACTCATTCCTTCTGCTCTTGGGCACAGGCAGTGGGTG 
TAGAGGTAACCAGTAGCTTTGAGAAGCTACATGTAGCTCACCAGTGGTTTTCTCTA 
AGGAATCACAAAGGTAAACTACCCAACCACATGCCACGTAATATTTCAGCCATTCA 
GAGGAAACTGTTTTCTCTTTATTTGCTTATATGTTAATATGGTTTTTAAATTGGTA 
ACTTTTATATAGTATGGTAACAGTATGTTAATACACACATACATATGCACACATGC 
TTTGGGTCCTTCCATAATACTTTTATATTTGTAAATCAATGTTTTTGGAGCAATCC 
CAAGTTTAAGGGAAATATTTTTGTAAA



Sequence ID 706 SEQ ID NO: 296 nt: 496

CAACCCTCTCTCCTCAGCGCTTCTTCTTTCTTGGTTTGATCCTGACTGCTGTCATG
GCGTGCCCTCTGGAGAAGGCCCTGGATGTGATGGTGTCCACCTTCCACAAGTACTC 
GGGCAAAGAGGGTGACAAGTTCAAGCTCAACAAGTCAGAACTAAAGGAGCTGCTGA
CCCGGGAGCTGCCCAGCTTCTTGGGGAAAAGGACAGATGAAGCTGCTTTCCAGAAG 
CTGATGAGCAACTTGGACAGCAACAGGGACAACGAGGTGGACTTCCAAGAGTACTG
TGTCTTCCTGTCCTGCATCGCCATGATGTGTAACGAATTCTTTGAAGGCTTCCCAG 
ATAAGCAGCCCAGGAAGAAATGAAAACTCCTCTGATGTGGTTGGGGGGTCTGCCAG 
CTGGGGCCCTCCCTGTCGCCAGTGGGCACTTTTTTTTTTCCACCCTGGCTCCTTCA 
ACACGTGCTTGATGCTGAGCAAAGTTCAATAAAGATTTTGGGAAGTTT 

Sequence ID 707 SEQ ID NO: 297 nt: 397

CGGATGTGGTGGCAGGCGCCTCTAGTCCCAGCTACTCGGCAGGCTGAGGTAGGAGA 
ATGGCTTGAACCCAGGAGGTGGAGCTGACAGTGAGCCGAGATCGCGCCACTGCACT 
CCAGCCTGGGCGGCAGAGCGAGACTCCATCTCAAAAAAAAAAAAAAAAAAAATAGA
CTTTGAGACCAGCCTGACCAACATAGTGAAACCCGTCACTACTAAAAATACAAAAA 
TTACCCGGGCGTGGTGACGGGCGCCTGTAATCCCAGCTACTTGGGAGGCTGAGACA 
GGAGAATCACTTGAACCAGGGAGGCGGAGGTTGTAGTGAACTGAAATCGTGCCCCT 
GCACTCCAGCCTGGGTAACAAGAGCGAAACTCCGTCTCAAAAATAAATAAATAAAT
AAAAT 

Sequence ID 708 SEQ ID NO: 298 nt: 293

CCAGCTTTTTATGGTGTTTAATCTAATACACTTAAGCTGCAGTCCCAAAATTAGGG 
GTCCTTCAGTCTTGGAGACTATAAGGGAGCCTCTGCACCCAGGGAAAATGTTACCC 
TTTACAGGGGGGAAGGGTAAACCAGTAGGGAATACAGTACAATCCCAACCCTACTG 
GGAGGGGCGGGAGGGAGGTGTTGCCGTCACTGTATTAAGTCGATGTTGGGAAACGT 
TTTAACATCTGGAGCCTTTGTGGGTGGAAATATGTCTCCAGTTACAACTCCGCAGT 
GGATGTGAAGAAG

Sequence ID 709 SEQ ID NO: 299 

GGAAGCTACAATGATTTTGGGAATTACAACAATCAGTCTTCAAATTTTGGACCCAT 
GAAGGGAGGAAATTTTGGAGGCAGAAGCTCTGGCCCCTATGGCGGTGGAGGCCAAT 
ACTTTGCAAAACCACGAAACCAAGGTGGCTATGGCGGTTCCAGCAGCAGCAGTAGC 
TATGGCAGTGGCAGAAGATTTTAATTAGGAAACAAAGCTTANCAGGAGAGGAGAGC 
CAGAGAAGTGACAGGGAAGCTACAGGTTACAACAGATTTGTGAACTCAGCCAAGCA
CAGTGGTGGCAGGGCCTAGCTGCTACAAAGAAGACATGTTTTAGACAAATACTCAT
GTGTATGGGCAAAAAACTCGAGGACTGTATTTGTGACTAATTGTATAACAGGTTAT 
TTTAGTTTCTGTTCTGTGGAAAGTGTAAAGCATTCCAACAAAGGGGTTTTAATGTA 
NATT 

Sequence ID 710 SEQ ID NO: 300 

TGGATTCCCGTCGTAACTTAAAGGGAAACTTTCACAATGTCCGGAGCCCTTGATGT 
CCTGCAAATGAAGGAGGAGGATGTCCTTAAGTTCCTTGCAGCAGGAACCCACTTAG 
GTGGCACCAATCTTGACTTCCAGATGGAACAGTACATCTATAAAAGGAAAAGTGAT 
GGCATCTATATCATAAATCTCAAGAGGACCTGGGAGAAGCTTCTGCTGGCAGCTCG 
TGCAATTGTTGCCATTGAAAACCCTGCTGATGTCAGTGTTATATCCTCCAGGAATA 
CTGGCCAGAGGGCTGTGCTGAAGTTTGCTGCTGCCACTGGAGCCACTCCAATTGCT 
GGCCGCTTCACTCCTGGAACCTTCACTAACCAGATCCAGGCAGCCTTCCGGGAGCC 
ACGGCTTCTTGTGGTTACTGACCCCAGGGCTGACCACCAGCCTCTCACGGAGGCAT 
CTTATGTTAACCTACCTACCATTGCCCTGTGT 

Sequence ID ^r SEQ ID NO: 301 nt : 

498 

GTGGTACATATACACAAAGGAAAACTATGTAGCCATTAAAAGAAAAGGAACTCCTA
TCATTTGTAACAACATAAATAAATCTGGAGGAGATTAGGCTAAGGTGAAATAAGCC
AGGCACAAAAAGACAACTACCATATGATCTTACTTATACGTGTGTGGAATCTAAAA
AGGTGGAATTTACAGAAGCAGAGAGTAGAATGGTGATTACCAGAGGCTGGGGAGTG 
AGGGCAGGAGGTTGGAGAAATGTTGGTCAAAGGATACAAAGTTTCAGTTATACAGG 
ATGAATAAGTTCAAGAGATCTATTGTACAACGTGGTGGCTATAGTTGATAACAATG 
TATTGTGTTCTTGAAAAATGCTGAGAGAGTAGATTTTAAGTGTTCTCACCACAAAA 
CATAAGTATGTGAGGTAATGCATGTGTTAATTANCTTAATTTAGACATTTCATAAT 
GTATTATACATATTTCAAAACCACGTTGTACATGAGAAAGATACACAATT 

Sequence ID 713 SEQ ID NO: 302 

GCCCAGTCGACCCATGTTCTCCTTTCTACACCAGCATTAGACGCTGTCTTCACAGA 
TTTGGAAATCCTGGCTGCCATTTTTGCAGCTGCCATCCATGACGTTGATCATCCTG 
GAGTCTCCAATCAGTTTCTCATCAACACAAATTCAGAACTTGCTTTGATGTATAAT 
GATGAATCTGTGTTGGAAAATCATCACCTTGCTGTGGGTTTCAAACTGCTGCAAGA 
AGAACACTGTGACATCTTCATGAATCTCACCAAGAAGCAGCGTCAGACACTCAGGA 
AGATGGTTATTGACATGGTGTTAGCAACTGATATGTCTAAACATATGAGCCTGCTG 
GCAGACCTGAAGACAATGGTAGAAACGAAGAAAGTTACAAGTTCAGGCGTTCTTCT
CCTAGACAACTATACCCGATCGCATTCAGGTCCTTCGCAACATGGTCACTGTGCAG
ACCTGAGCAACCCCACCAAGTCCTTG 

Sequence ID 714 SEQ ID NO: 303 

CTGTAACAGAGATTCCTTTTTTCAATAATCTTAATTCAAAAGCATTATTAGACTTG
AAAGGGTTTGATAATCTCCCAGTCCTTAGTAAAGATTGAGAGAGGCTGGAGCAGTT 
TTCAGTTTTAAATGAGTCTGCAGTTAATATCAAATGTGAGTTTGGGACTGCCTGGC 
AACATTTATATTTCTTATTCAGAACCCTTGATGAGACTATTTTTAAACATACTAGT 
CTGCTGATAGAAAGCACTATACATCCTATTGTTTCTTTCTTTCCAAAATCAGCCTT 
CTGTCTGTAACAAAAATGTACTTTATAGAGATGGAGGAAAAGGTCTAATACTACAT 
AGCCTTAAGTGTTTCTGTCATTGTTCAAGTGTATTTTCTGTAACAGAAACATATTT 
GGAATGTTTTTCTTTTCCCCTTATAAATTGTAATTCCTGAAATACTGCTGCTTTAA 
AAAGTCCCACTGTCAGATTATATTATCTAACAATTGAATATTGNAAATATACTTGG
CTTACCTCTCAATAAAAGGGTCTTTTCTATT 

Sequence ID 717 SEQ ID NO: 304 

TCCACCCACCTTGACCTCCCAAAGTGCTGGGATTATAGGCGTGAGCCACCTCGCCC
AGCCCGATACTAGGACTTATGCAGAAAAAACCTTGACATGGAGGAAAGTAAGATCT 
AAATAAATACTGTATTCATAGATTAAAAGACTCAGCATAATAAATATACCATTTCT 
CCCCAGATTGATGTACAGATTTAACACAATTCCTATCAAGATCCCAGCAAGATTTT 
TGTAGATATGTAAAAGATTATTCAAAAATGTAAAAGGAAGGACAAAGGACTAGAAT
AGATAAAACAAAATGGAGAAAGATTTAATAGGAATCACTGTAACTGATTTTAAGAC
ATACAGAACAATAATAGAAACTGCTTGTATTAGTCCATTTTCACGCTGCTGATAAA 
GACATACCTGAGATTGGCAATTACAAAGGAAAGANGTTTATTGGCTTACAGTTCCC 
ATGGCTGGGGAGGCCT 

Sequence ID 718 SEQ ID NO: 305 

CTCCTCTGGGTTGAAACCCGGGCGCCGCCAAGATGCCGGCTTACCACTCTTCTCTC 
ATGGATCCTGATACCAAACTCATCGGAAACATGGCACTGTTGCCTATCAGAAGTCA
ATTCAAAGGACCTGCCCCCAGAGAGACAAAAGATACAGATATTGTGGATGAAGCCA 
TCTATTACTTCAAGGCCAATGTCTTCTTCAAAAACTATGAAATTAAGAATGAAGCT 
GATAGGACCTTGATATATATAACTCTCTACATTTCTGAATGTCTGAAGAAACTGCA 
AAAGTGCAATTCCAAAAGCCAAGGTGAGAAAGAAATGTATACGCTGGGAATCACTA 
ATTTTCCCATTCCTGGAGAGCCTGGTTTTCCACTTAACGCAATTTATGCCAAACCT 
GCAAACAAACAGGAAGATGAAGTGATGAGAGCCTATTTACAACAGCTAAGGCAAGA
GACTGGACTGAGACTTTGTGAGAAAAGTTTTCGACCCTCAGAATGATAAACCCAGC
AAGTGGNGGGCTTGCTTTGTGAAGAGACAGTTCATGAACAANAGTCTTTCAGGACC
TGGACAGTGAAGGGAGCCCGGGCAGCCA 

Sequence ID 719 SEQ ID NO: 306 

CGNGGCCGCGTNAACTTTTGATCGTCAGCTGGGGCTGGCAGGCACCTAAATGGGAA 
GGGTGATAGCAGTGTGTTGGGGGGAGTTTAGGGAACGGTCCTCTACCGATAGAGGC 
AGCANCTCATTGGAATTTCCTCCTGAAGTTGTCTTGCCCCTTGAATCCTGCAGGAA 
GGCTGGCAAATGGCCATTTCCCTTCCACTTGAATAGAGACCCATAACTCAAGTATC 
TGCCCTTAAGACACCACAGGACTGTTCTTCGCGGGCCCTGCCCCTGGATTTGGGAG 
AGGCAGTCCANCTCACCCAACTAGGCTCTGCANGGGGACCANGAGGGATGGGTTGT 
GTCCACAGGACCAGCCAGACTGATGAGGGATGCGGCAAGCATATTCTCACCACCTT
CTTTCACGTTTACAACANACCAGCNTTCCCTGTGTGGCAGGGGTTACATTGGTCAC 
CGAGGACCTANAATCATGGAGTGCTCTGGGGATCCGGGCTTGGA 

Sequence ID 720 SEQ ID NO: 307 

TCAGTGTTGAATTTTGTCAGACACTTTCTCTGCATCAATTGGTATGACCATGTGAT 
TTTTTTTCTGTAGCCTGTTAATATGGTTAATTTTCAAATATTGAGCTGATTAATTT 
TCAAATATTGAGCTCTCCTTGCATCTCTGGAATAAGTACCACTTGGTCGTGGTATA 
TATTTCTTTTAATATATTGCTGAATTCTGTTTGATCATGTTTTCTTAAAGACTTTC 
GTGTCTGTTTTCATGATAGATACTGGTCTATAGTTTTGTTGTAATATCTTGGTTTG 
ATTTTGATATCAGGATAATGCTACCTTAATAGAATGAATTGGAGCCAAGTATGGTG 
GCAAATGCCTATAGTCCTAGCTACTCAGGAGGCTGAGGTGGTGGGGACTGCTTGAC 
CCANGAGTTCAAATCTAGCTTGGGCAATGTAGCAAGAC

Sequence ID 721 SEQ ID NO: 308 

TAGAAGGAATGACTATTCATGTCCAAAGTGAATGGTTTTGTGCAGTGAACAACACA 
TGGCGAGGTACTAACTGAGAAACTTTTTCATGCTTTATGCCTACCTCTTGTAGTTG 
TTGCAGAGCAAATATAAATTGTAATAAGATAGCTAGGCCTTGCAGAAACAAACAGA 
AAAACTTAAAAAAAAATGATATAAGAGCTGGAGTCTAGTATTTATATGAATCTGTG 
AGAGATAATTTTTTTGGTCTCACTGCAATGAACCAAAAGCGGCTGAGTTTGGTTTT 
TAATTGTAGCCATGTATTGAAGGCATCTTTTTGACCAACTCTTGTTGGTTCTGTCT 
TGAACCATTGTTAATCACTGTGCTGTAATTAGTATAGCTAAATCTTTTCCTTCCTT 
GCTCCTCCCCCAGCCCACCCCGTCTTCCCTTAACATTTTTTCAGGGGGGGTTGGGA 
GTGGTTTCATTTTAATGTGAGTGGATGTTTTGATAGTTGTAAGGAAAAAATGCATT 
TCAGACACATTTCACACATGAGCTATTTTCTTACACAGTATGTCTTATTGGTAATA 
AGAATGTAATTCAT

Sequence ID 722 SEQ ID NO: 309

CNTTCCNTAAGAATACAAAAAATTAGCTGGGCGTGGTGGCAGGCGCCTGTAATCCC 
ATCTACTCAGGAAGCTGAGGCTGGAGAATCGCTTGAACCCGGGAGGCGGAGGTTGC 
AGTGAGCAGAGATCACGCCACTGCAGTCCAGCCTGGGCAACAGTGCGAGACTCTGT 
CTCAAAAAAAAAATAAATAAATTACCTGGGTGTGGCAGCGCGTGCCTGTAATCCCA 
GCTACCCAGGAGGCTGAGGCAAGAGAACTGCTTGAACCCAGGAGGCAGAGGTTGCA 
TGGAGCTGAGATGGCGCCACTGCACTCCAGTCTGGTGACAGAGTGAG 

Sequence ID 724 SEQ ID NO: 310 

CTCTCTACTAAAAATACAAAAATTAGCTGGGCACGGNGGTGCATGCCTGTAAACCC 
AGCTACCAGGTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCAGGGAGTCGG 
AGGTTGCGGCGAGCTGAGATCATGCCACTGCACTGCGGCCTGGAGACAAGAGCAAG 
ACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAGACNTCACCTAATTGCAG
NGNGNGGACCTTATTTGGCTNTTAATTCAAACTATTAAAAATGTGAACN 

Sequence ID ^-& SEQ ID NO: 311 nt : 

260 

CGGGGTCTGTACCGGGCTGGCCTGTGCCTATCACCTCTTATGCACACCTCCCACCC 
CCTGTATTCCCACCCCTGGACTGGTGGCCCCTGCCTTGGGGAAGGTCTCCCCATGT 
GCCTGCACCAGGAGACAGACAGAGAAGGCAGCAGGCGGCCTTTGTTGCTCAGCAAG 
GGGCTCTGCCCTCCCTCCTTCCTTCTTGCTTCTCATAGCCCCGGTGTGCGGTGCAT 
ACACCCCCACCTCCTGCAATAAAATAGTAGCATCGG 

Sequence ID 727 SEQ ID NO: 312 

CTGAGTNTAGAAATGATGCCATTAATACTGATTGCAAAAACATTACAACTCAGTAC 
TGCAGCTTTCATTCAAATAGGTTATATGTATAAACTGAGTTCAACAATATTGTATT 
TGAGATGGTAAAGTTAAAGAAATGCAATAATGTAAATAATACTTAAGAAAATAAGA
TCTCAGGAAACTGTATATACTCTGTACTTTTATGCAACTTTATCAGATCATTTCAG
TATATGCATCAAGGATATAGTGTATATGACATGAACTTTGAGTGCAAAAACTGTAC 
TATGTACCTTTTGTTTATTTTGCTGTCAACATCTAAATAAAGGTTTTTTTGTTTGT 
TTTTTGTTTTTTTAATTGTTTTGTTTTAAAGATTGTTTTAATTAATTAAAAAATTA
ATTGTTTTAATTAAACAATTGTTTAATTGTTTTAAAGTCGCCAGGCTGAGGCAGGT 
GAATCACAAGCTTAGGAGTTGGAGGCTAGCCTGCCAACATGGTGAAACCCCGTCTC 
TACTAAAAATACAAAAAAATTAACTGGGTGTGGG

Sequence ID 728 SEQ ID NO: 313

CCCATCTGCACCAGTACACAGGCAGGCATTATCATTCTTCACCTACTTTTTAAATA 
GTGGCAACTTGGGATTCATTCTGGTGATTCTGAACCTTGCCTCATAGCTTAAAGTA 
TAAAAAAGATTCAAGAGCAGTGAGGTTTGTTCTTTCCAGTGAATGGTGGACTGAGT
GGTGCGAGGTGGAGGGCTAACAAGAGGAAAGAACTACATTCTTCAGAATACAGTGA 
TGAAAATTCATTTTGAAACTCAAATATTTTCATTTTGGATATTCTCCTGTTTTTAT 
TAAACCAGTGATTACACCTGGCCATCCCTCTAAATGTTCTAGGAAGGCATGTCTAT 
TGTGATTTTGATGAAGACAGAATTATTTTTCTCTGTAGAAACACAGATACCACTTT
ATCAGGGGAAGTTAGTCAAATGAAATGGAAATTGGTAAATGGACAAAAGCTAGCTA 
GTAAAAAGGACGACCCAGCAACATGCTTTAACCCCATTGTATGTTTGTGGAAAGAG 
CATAGTTTAACATCTTGAGAAATTTGGGACATAAAAGTTTTCATNGGTAGACAGTT
CATGGCAGTATATGAATTGACATAATGGAAATAATCTGATTTTATTTTTACAACTA 
ACATCCTTTCCCC 

Sequence ID 736 SEQ ID NO: 314 nt: 641

GGAATTCCAAGTGCTTGGGGATAATGATACCTCTGACCTTTCTTCCTTTTGGGAAG 
TACTTGAGTGTGCAGCTGCATGAGGCCTCAGCAGGAGAGAGATTTTAGGTCCAAGA 
AGCTATACCAGTAGGACAAGGCAGGAAAATACTACACTTTCAGGATCAAGCCCCTC 
TGACTCTCATTTGGAAACTGGATGTTTGCTAAGCACCTGCTTCTTAAGGATGCCGA 
GGGATTTAATGATACTCCCAGAAACCTGGAGAGATTAATGGGGCCTATGGAGAAGT 
GCTCTGAACTCAGTGTTGGGACTTGAATAAAATTAACCATTGTCATGTTTTCAGAA 
CAACTAAGCTGTTTTATATTTCATGTGCATGAAAGCCCTAGAACTAAGTTGTGTTA 
TTTCCAGAAATGAAATAGATCCCACAGTTAGATGATGTGGCCATTAGGAAGTACCA 
AATTTATAAAAATCACTGGAGGTCTGTCTGAGCAGTACCTAATAAAATATAGTATA 
CTGAAAGTGAACAGATACTTTGTCTCTTTCTTTGGCTGCTTGATCTTTATCTGTGT 
CTGCCGTACAGTGCACCCTTAAAGTATTCTACACCAGTGCTTCTCAAACTGGAAAT 
GTGCATGTAAGTCACCCANGGGTCT 

Sequence ID 739 SEQ ID NO: 315 

TGCATGCCCATAGTCCCAGCTATTTGGGAGGCTGAGGCAGGAAAATCGCTTGAACC 
CGGGAGCCAGAGGTTGCAGTGAGCCGAGATCGCACTCCAGCTTGGCGACAGAACAA 
GACTCTGTCTCAAAAAAAAAAAAAAAAGAAATCTTGGGATCCTGAACCCCTTACTC
GAAGGGCTAAGGTAGCATCTCAGCATGTCTTATTCGAGACTTCGTANAACCAGACC 
TGCTGTTTGTAGATGTTAATTAATCAAACCTTTCTCTACTCATTCTGGACCAGTTA 
AGGTTTTCTCCTTCTCCGTATGAGTTTTGATTTTCGTCCTCCTTGGTTGGAGATCA
CACTTTGGTCTGCTGCTAAGTTGGATGCCTCCCACTGTCTTTCCCTAAGTCTAGGG
CTTCANACCCCAGTGTGGGGAGAGGGACTTTCGTTTCCTGCCCCTCACCACATCAG
ACACAGGCAGGCAAGAATAAGATGGCCAAAAGGCCGATGAACTTCTTGACCTAGCC
TGGGACATTACCTGTTACTAGGTGGACTTCACTGCCTGTGAATGGAAGCTGAAGGG 
CTGTTTTTTTGGTTTGTATTTGGACAGGCCAGGCTTANAGAGGGAGAGAACTGGGC
TACTCTTCAGCAGTGATCTTTAAAATGCC 

Sequence ID 747 SEQ ID NO: 316 

CAGAGTGCAAGACGATGACTTGCAAAATGTCGCAGCTGGAACGCAACATAGAGACC 
ATCATCAACACCTTCCACCAATACTCTGTGAAGCTGGGGCACCCAGACACCCTGAA 
CCAGGGGGAATTCAAAGAGCTGGTGCGAAAAGATCTGCAAAATTTTCTCAAGAAGG
AGAATAAGAATGAAAAGGTCATAGAACACATCATGGAGGACCTGGACACAAATGCA
GACAAGCAGCTGAGCTTCGAGGAGTTCATCATGCTGATGGCGAGGCTAACCTGGGC
CTCCCACGAGAAGATGCACGAGGGTGACGAGGGCCCTGGCCACCACCATAAGCCAG 
GCCTCGGGGAGGGCACCCCCTAAGACCACAGTGGCCAAGATCACAGTGGCCACGGC 
CACGGCCACAGTCATGGTGGCCACGGCCACAGCCACTAATCAGGAGGCCAGGCCAC
CCTGCCTCTACCCAACCAGGGCCCCGGGGCCTGTTATGTCAAACTGTCTTGGCTGT 
GGGGCTAGGGGCTGGGGCCAAATAAAGTCTCTTTCCTC 

Sequence ID 757 SEQ ID NO: 317 nt: 583

GAACCCTGCGGAGGGACTTCAATCACATCAATGTAGAACTCAGCCTTCTTGGAAAG 
AAAAAAAAGAGGCTCCGGGTTGACAAATGGTGGGGTAACAGAAAGGAACTGGCTAC 
CGTTCGGACTATTTGTAGTCATGTACAGAACATGATCAAGGGTGTTACACTGGGCT 
TCCGTTACAAGATGAGGTCTGTGTATGCTCACTTCCCCATCAACGTTGTTATCCAG 
GAGAATGGGTCTCTTGTTGAAATCCGAAATTTCTTGGGTGAAAAATACATCCGCAG
GGTTCGGATGAGACCAGGTGTTGCTTGTTCAGTATCTCAAGCCCAGAAAGATGAAT 
TAATCCTTGAAGGAAATGACATTGAGCTTGTTTCAAATTCAGCGGCTTTGATTCAG
CAAGCCACAACAGTTAAAAACAAGGATATCAGGAAATTTTTGGATGGTATCTATGT 
CTCTGAAAAAGGAACTGTTCAGCAGGCTGATGAATAAGATCTAAGAGTTACCTGGC 
TACAGAAAGAAGATGCCAGATGACACTTAAGACCTACTTGTGATATTTAAATGATG 
CAATAAAAGACCTATTGATTTGG



Sequence ID ^^- SEQ ID NO: 318 

424 



nt :
CTTGGCTCCTGTGGAGGCCTGCTGGGAACGGGACTTCTAAAAGGAACTATGTCTGG
AAGGCTGTGGTCCAAGGCCATTTTTGCTGGCTATAAGCGGGGTCTCCGGAACCAAA 
GGGAGCACACAGCTCTTCTTAAAATTGAAGGTGTTTACGCCCGAGATGAAACAGAA
TTCTATTTGGGCAAGAGATGCGCTTATGTATATAAAGCAAAGAACAACACAGTCAC 
TCCTGGCGGCAAACCAAACAAAACCAGAGTCATCTGGGGAAAAGTAACTCGGGCCC
ATGGAAACAGTGGCATGGTTCGTGCCAAATTCCGAAGCAATCTTCCTGCTAAGGCC 
ATTGGACACAGAATCCGAGTGATGCTGTACCCCTCAAGGATTTAAACTAACGAAAA 
ATCAATAAATAAATGTGGATTTGTGCTCTTGT 

Sequence ID 764 SEQ ID NO: 319 nt: 626

GAT TTTTTTTTTTTTTTT GAGAT GGAGT C TTTCTCTGTCGCC C AGGC T GGAG T GC A 
GTGGTGAAATCTCGACTCACTGCAACCTCCGTCTCCTGGGTTCAAGCAATTCTCCT 
GCCTCAGCCTCCTGAGTAGCTGGGATTACAGGCACCAGCCACCACGCCCGGCTAAT 
TTTTGTATTTTTAGTAGAGACAGGTTTTCACCATGTTGGCTAGGCTGATTTTGAAC 
TCATGACCCCAAGTGATCTGCCCGCCTCGGCCTCCCAAAGTGCTGGAATTACAGGT 
GTGAGCTACCACTCCCAGCCAATGATTACATTTATAAGGTAAAATAACTTGTGCCA 
ATCTGTACAAGTGAATTCAGATTTAAAATTTTAATTGTAAAAAGATATCCAGGTGA 
TATTTCTCCCTGAATAATTTAGTTTCCTTTTCTATTTCTTGATATAAAAGTACTCA 
GCATTGAAGTAATTGCTATCTTCACATTTCTTCCTATTTGAGCTGTCTAAATAAGT 
AGTCCTACATATTTTCCCCCCAACACAAAAAACCCAGAAAAGAATTATTTTATACT 
GGATTTTTTTGGTTGTAGCAGGAACCTAAAGGNGCCAATTGTAACATGCATGTTCT 
TTTTGGCAAA 

Sequence ID 766 SEQ ID NO: 320 

GTCCATCCTGCAGGCCACAAGCTCTGGATGAGGAACTTGAGGCAAGTCACCAGCCC 
CTGATCATTTCGCCTAAAAGAGCAAGGACTAGAGTTCCTGACCTCCAGGCCAGTCC 
CTGATCCCTGACCTAATGTTATCGCGGAATGATGATATATGTATCTACGGGGGCCT 
GGGGCTGGGCGGGCTCCTGCTTCTGGCAGTGGTCCTTCTGTCCGCCTGCCTGTGTT 
GGCTGCATCGAAGAGTAAAGAGGCTGGAGAGGAGCTGGGCCCAGGGCTCCTCAGAG 
CAGGAACTCCACTATGCATCTCTGCAGAGGCTGCCAGTGCCCAGCAGTGAGGGACC 
TGACCTCAGGGGCAGAGACAAGAGAGGCACCAAGGAGGATCCAAGAGCTGACTATG 
CCTGCATTGCTGAGAACAAACCCACCTGAGCACCCCAGACACCTTCCTCAACCCAG 
GCGGGTGGACAGGGTCCCCCTGTGGTCCAGCCAGTAAAAACCATGGTCCCCCCACT 
TCTGTGTCTCAGTCCTCTCAGTCATCTCGAGCCTCCGTTCAAAATGATCATCATCA 
AAACTTATGTGGCTTTTTGACCTTTGAATAGGGAATTTTTTAAAATTTTTTAAAAA
TT

Sequence ID 768 SEQ ID NO: 321

CCAGCGCAGGGGCTTCTGCTGAGGGGGCAGGCGGAGCTTGAGGAAACCGCAGATAA 
GTTTTTTTCTCTTTGAAAGATAGAGATTAATACAACTACTTAAAAAATATAGTCAA
TAGGTTACTAAGATATTGCTTAGCGTTAAGTTTTTAACGTAATTTTAATAGCTTAA 
GATTTTAAGAGAAAATATGAAGACTTAGAAGAGTAGCATGAGGAAGGAAAAGATAA
AAGGTTTCTAAAACATGACGGAGGTTGAGATGAAGCTTCTTCATGGAGTAAAAAAT
GTATTTAAAAGAAAATTGAGAGAAAGGACTACAGAGCCCCGAATTAATACCAATAG 
AAGGGCAATGCTTTTAGATTAAAATGAAGGTGACTTAAACAGCTTAAAGTTTAGTT 
TAAAAGTTGTAGGTGATTAAAATAATTTGAAGGCGATCTTTTAAAAAGAGATTAAA
CCGAAGGTGATTAAAAGACCTTGAAATCCATGACGCAGGGAGAATTGCGTCATTTA
AAGCCTAGTTAACGCATTTACTAAACGCAGACGAAAATGGAAAGATTAATTGGGAG
TGGTAGGATGAAACAATTTGGAGAAGATAGAAGTTT

Sequence ID 773 SEQ ID NO: 322 

GAGGAAAGGGGAGTTAATATTTAGTGGACAGAATTTCAGTTTTACAGATGAAAAGA
GTTCTGGAGATAGACGGTGTTGATAGTTGCACAGCAGTGTGAATGTGCTCATTGTT 
ACCGAACTTAAAAATGTTTAACATAGTATTATGTGATTTTTATTTTGCCACTTAAA 
AAAAAAGAATGAAGTACTGATACATGCTACAACATGGGTGAGCTTTAAATACATTC
TGCTCAGTGAAATAAGCCAGATGCAAAAGATCACATATTATATAATCCACTTATAC
GAGATACCTAGAATAGGCAAATTCATAGAGACAGAAAGTAGAATAGTGGTTCCCAG
GGGCTGGGGACAAGGGGGCAGTGAGAGATTGAGAGTTATTATTAATGCGTACAGAG 
TTTCAGTTTGGGCTGATAAAAAAGTTCTGAAGATGGATGGTGATGATGGTTGTACA 
TCAATGTGAGTGTAATTACCGCCACTGAACTGCCCTTAAAAACGTTTAAAAGAGTA 
AATTTTATGTTGNGTATATTTTACCATAAT 

Sequence ID 776 SEQ ID NO: 323 

TTTTTTTTTCATAAGAGGCAAGTACAAGAAAAAGCTTAATTACTTTAACTTCTAAG
TAGTTTGGAATCTAAATAAATAGGAGTTACCAAATATATGCGCTTCTGTGAATAGT
TTTCCCCCACATGTTTATTTATATTTTTGCATCTCATCAAACCTAACAGATTCTAA 
AGTCTCTGGTGATAATGACAATATCTGCTACGGAGAGACTAGCCTGGGGGAAGAGG 
ATCTCCCTGAACAAGGATAGCGGAGTTGCTGCAGCTTTCAAATGAAGCTGGACATT 
TAGCTGCGGGGGTAGCACCCTTTGATCAAGGCAGCCCAAAGATGAGTTTCAGGGAT 
GGGACTGACAGAAGAGAAAAGTTCTTCCCAGCCCTTTCTACTTTTTCTCTTTGTTT
CTCAGGCTTCTGGCCGTCTTCAGTTTTCACAAGTTTCACTCTCAACCCTAAACAGT
ACTTCTGTGAAGTACCCTTTGGCCCCTCGTTTTCAGCTCCTAAACTCACCTGGAAA
TAGATGTCAATCTAATTTTGGGTCTGACTAGTGCAGTAGGCATTTTTGGTGA 

Sequence ID 782 SEQ ID NO: 324 

CTCACACAGAACAAAAATGAATGAGTGTGGCTGTGTGCCACTATCACTGTGTCTAC
AAAAACAGCCAGTGGGCCTGATTTGGCCCTTGGCTGCAGTGCGCCCGTCTCTGTTT 
TTGAGGAATAAAATCGCATCATTTCATATGGCTAATGCAATTTTTTTCCCATCTGG 
AAGCAACATCTGATTGGACTCATCTTGTATGGTGCTTGTTACAGTCTCTGTAAATG 
GGAGAGGGTCCGAGAATAGCTCTTCCTGTTTTCATCAGGACTGTTTTTAGGGATGG 
CAAAGAAGTCAGTGTGTCCAGCCTGTGTCCTCCTCACCACGTGGCTGATTCCTGAA 
TCTGCATGTGCANCACNTGCCGTTGTCTGGGGCATGATCTGTGTGA 

Sequence ID 7-&^ SEQ ID NO: 325 nt : 

556 

CTTTTCTCTGGGTATAGATTTACCCTAGCACCTATCTCATTATATTGAATTTTCCA 
GCATATTTAAATAAACTATTAATTAGTCACACTATTTCTTAAAAGTCACACTATCA 
ACTAATCGTGACCGCAATTATCTAGGGGTGATAATCTGCTGAGTCTACTCTTTAAA 
TACACTGGGACCCAGCATATTGAGTTATATTGGCACAGAAACTTCACTCTGGGTAT 
AGATTTACCCTAGTACCTTGCCGGCAGGATCCTATTATTCATGGTTGTACAAGCAA 
GGTTCAGGGAAGAGGCTGGCACAGAGAAGGTACCTGGTAACTGTTGTTTGAGGCTG 
AATTCAGCTCAACTCAGCTCCAGTAGAGATGGTGTCCCCTTCTCTACCGTGTTGAG 
ATAGTGTGCAGTCCCTTCCTAAGGGCTGTTACCCACCGCAATAGGACTTGTCAGCT 
TCAACTTTTAAATTTCTCTGCTCCCGCTGGGACCCACCCGCTTCAAAAATCATCAT 
GGNGGNTTTAGCACCAATTTAGTAAACACAAACTGTCTGAAATATTTTGGAT 

Sequence ID 796 SEQ ID NO: 326 

GAACATTCAAGATAGTGAGAGGAAGAAAAAGATATGGCTGTACGGGACCGAGGTCT 
CTTCTATTATCGCCTCCTCTTAGTTGGCATTGATGAAGTTAAGCGGATTCTGTGTA 
GCCCTAAATCTGACCCTACTCTTGGACTTTTGGAGGATCCGGCAGAAAGACCTGTG 
AATAGCTGGGCCTCAGACTTCAACACACTGGTGCCAGTGTATGGCAAAGCCCACTG 
GGCAACTATCTCTAAATGCCAGGGGGCAGAGCGTTGTGACCCAGAGCTTCCTAAAA 
CTTCATCCTTTGCCGCATCAGGACCCTTGATTCCTGAAGAGAACAAGGAGAGGGTA 
CAAGAACTCCCTGATTCTGGAGCCCTCATGCTAGTCCCCAATCGCCAGCTTACTGC 
TGATTATTTTGAGAAAACTTGGCTTAGCCTTAAAGTTGCTCATCAGCAAGTGTTGC 
CTTGGCGGGGAGAATTCCATCCTGACACCCTCCAGATGGCTCTTCAAGTAGTGAAC 
ATCCAGACCATCGCAATGAGTAGGGCTGGGTCTCGGCCATGGAAAGCATACCTCAG
TGCTCANGATGATACTGGCTGTCTGTTCTTAACAGAACTGCTATTGGAGCCTGGAA
ACTCAGAATGCAGATCTTTTGTGAACAAAATGAAGCAAGAACCGGAGACNCTGAAT
AGTTTTATTTCTGTATTAAAAACTGNGATTGGAACAATTGAAGA

Sequence ID 801 SEQ ID NO: 327

CCACTCCACCTTACTACCAGACAACCTTAGCCAAACCATTTACCCAAATAAAGTAT 
AGGCGATAGAAATTGAAACCTGGCGCAATAGATATAGTACCGCAAGGGAAAGATGA 
AAAATTATAACCAAGCATAATATAGCAAGGACTAACCCCTATACCTTCTGCATAAT 
GAATTAACTAGAAATGAGGATTCTGACCTTGACTTTGATATCAGCAAATTGGAACA 
GCAGAGCAAGGTGCAAAACACAGGACATGGAAAACCAAGAGAAAAGTCCATAATAG 
ACGAGAAATTCTTCCAACTCTCTGAAATGGAGGCTTATTTAGAAAACAGAGAAAAA
GAAGAGGAACGAAAAGATGATAATGATGATGAGTCAGGTAAAAGTTCCAGAAATGT
GAACAACAAAGATTTTTTTGATCCAGTTGAAAGTGATGAAGACATAGCAAGTGATC 
ATGATGATGAGCTGGGTTCAAACAAGATGATGAAATTGCTGAAGAAGAAGCAGAAG
AAGGAAGCATTTCTGAAATATGAATGAAAAAAATTACATCTTTAGAAAAAGAGTTA
TTAGAAAAAAGCCTTGGCAGCCGTCNGGGGGAAGTGACGCACAGAAGAGACCAGAG 
AATAGCTTCCTGGANGAGACCCTGCACTTTACCCATGCTGCTGGATGG 

Sequence ID - 8 0 8 SEQ ID NO: 328 nt : 641

CCGGGTTTTAGTATTTAACCAAGAGCCTTTTAAATATTGAAAACCCATAGTTCAGA 
AAATGTTAGTATTGCTGCCCTTCTTCACATAAATTTTTTTTTAAATTATACTATTA 
TTTTGCTTAATTTTATATTGGGTTAAAACAACCTTCAAGAAGGTTAACTAGGAAAG 
AAGACCTTTTTGTTTTATTTTTACTATTTATATATAGAAGACAAATCAGCATTTGG 
TGATAGTTTTACATGACCAGTTATCAAACGGTCATAGTATGAAGTGTGCAGTTGTT 
CATTATTAGTAAATTATGTTTGATTTTTAAACTATTTAGTACTAATAGTTGAGATG 
AAAACTGAAGAAAAATGCCAATGTGACGTTTGTGTATAGCTAGCCTTAAAAAACTT
CCCATGTTTTTAGGTGACTTTTTTCCCCCTCTTAGTACTCTGGAGAAACAATGAAG 
ATGGGCCATCTCAATTCCAGATGTAAACAAAAAGTAATTTTTATTTCAACATTTAA 
TGTAACTGCTATTATTGNGGATTCTTGNCTTGNGTATTTTCTTTCCCTTATTCAAG 
TAATATAGAATAACTTTCCTTAAAATGATTTGATCCAAGATACGTCATTTCTGTAT
TGGCAAAATGCCNCTATTAAAGTGT 



Sequence ID 8^- SEQ ID NO: 329 nt : 132

GTTAAAGTGATACATTTTTATACCAAATGTGTTTATTTTTTTGTGCAAGTAATCCT
TAAAATTGCAATTGTATTAGGTGTTAAAATAAAGTTTTTAAAAAATTAAAAAAAAA
AAAAAAAAAAAAAAAAAAAA 

Sequence ID 817 SEQ ID NO: 330 

GACAACCTTAGCCAAACCATTTACCCAAATAAAGTATAGGCGATAGAAATTGAAAC 
CTGGCGCAATAGATATAGTACCGTAAGGGAAAGATGAAAAATTATAACCAAGCATA 
ATATAGCAAGGACTAACCCCTATACCTTCTGCATAATGAATTAACTAGAAATAACT 
TTGCAAGGAGAGCCAAAGCTAAGACCCCCGAAACCAGACGAGCTACCTAAGAACAG 
CTAAAAGAGCACACCCGTCTATGTAGCAAAATAGTGGGAAGATTTATAGGTAGAGG 
CGACAAACCTACCGAGCCTGGTGATAGCTGGTTGTCCAAGATAGAATCTTAGTTCA 
ACTTTAAATTTGCCCACAGAACCCTCTAAATCCCCTTGTAAATTTAACTGTTAGTC 
CAAAGAGGAACAGCTCTTTGGACACTAGGAAAAAACCTTGTAGAGAGAGTAAAAAA
TTTAACACCCATAGTAGGCCTAAAAGCAGCCACCAATTAAGAAAGCGTTCAAGCTC
AACACCCACTACCTAAAAAAATCCCAAACATATAACTGAACTCCTCACACCCAATT 
GGACCAATCTATCACCCTATAGAAGACTAATGTTAGTATAAGTAACATGAAAACAT 
TCTTCTNCGCATAAGCCTGCGTCAGATTAAAACACTGAACTGACAATTAA 

Sequence ID 8-2^ SEQ ID NO: 331 nt : 370

AAAGAGCTCCCAAATGCTATATCTATTCAGGGGCTCTCAAGAACAATGGAATATCA
TCCTGATTTANAAAATTTGGATGAAGATGGATATACTCAATTACACTTCGACTCTC
AAAGCAATACCAGGATAGCTGTTGTTTCANAGAAAGGATCGTGTGCTGCATCTCCT 
CCTTGGCGCCTCATTGCTGTAATTTTGGGAATCCTATGCTTGGTAATACTGGTGAT 
AGCTGTGGTCCTGGGTACCATGGCTGGTTTCAAAGCTGTGGAATTCAAAGGATAAA 
TTAATGAAGAAAACAAGCGGAGCTGAAGAAGAAAGTACAATATGGTGCTGTCTTCC 
TAATGAAATAAATTCACTAAATGGACATTAAAAA

Sequence ID 825 SEQ ID NO: 332 

AGACTCGAGCAAGCTTATGCATGCATGCGGCCGCAATTCGAGCTCGGCCACTTGGC 
CAATTCGCCCTATAGTGAGTCGTATTACAATTCACTGGCCGTCGTTTTACAACGTC 
GTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCT 
TTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTT 
GCGCAGCCTGAATGGCGAATGGAAATTGTAAGCGTTAATATTTTGTTAAAATTCGC
GTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAAA
TCCCTTATAAATCAAAAGAATAGACCGAGATAGGGTTGAGTGTTGTTCCAGTTTGG
AACAAGAGTCCACTATTAAAGAACGTGGACTCCAACGTCAAAGGGCGAAAAACCGT
CTATCAGGGCGATGGCCCACTACGTGAACCATCACCCTAATCAAGTTTTTTGGGGT
CGAGGTGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAGCCCCCGATTTAAAGCT
TGACGGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGGAAAAAAGCCAAANGGAG
CCGGCGCTAGGGCCTGGCAAGTGTACGGGCACGCTGCGCGTAACCACCCACACCCC 
GCCGNGCTTAATGCCCCNTTCAGGGCGCGTNCTGATGCCGNATTTTNTCTTACNCA 
TNTGTGCNGGNTT 

Sequence ID 833 SEQ ID NO: 333 

TAAAATAATGGCAAAAAACAAACAAAAAACAAGTTCTCTAAACAGAAAGGAAATTA
CTAAAGAAGGAATCTTGAAATAACAGGAAAGAGGAAATACCACAGTAGGCAACATT 
ATGGGTAAATAAAACAGACTTTCCTTCTTTAGTTTCCTAAAATATGTTTGATGATT
AATGCAAAAATTACAATATTTTCTTATGTAGCACTAAAGGTATGTAGAGAAAATAT 
TTAAGATAATTGTACTGTAAGCGGGAGATGACAGTGACATAAAGGCAACGTTTTTA
TACTTCACTCAAACTTTATGTATTAATGTAATCCATAAAGCAACCAAAAAAGCTAT 
ACTAAGTACATTCAAAAACACAATAGATAAACCAAACAAAATTCTAAAGGATGTAC 
AAGTAACCCACTGGAAGCTGCAAAAAATGTAAACAGAAACTAAAAACAGAGAATAA
ATGAAAAATTAAAAACGAAATGGCAGACTTAGGCCCTAATATACAAATTATCACAT 
TAAATATAAATGGTCTAAATACACCAACTGTAAGACAGAGATTAGCAAAGTCGATT
TAAAAACATGACTCAACTACGTGCTGTCTACAAGAAACTCACTTCAAATATACCAA
GATAGGAAGGTTGAAAGTAAAACGATGGAAAAAGATGTATCATGTGAACATTAATC
AAAGGAAAGCAGGGGTGGCTATATTAACATCAGGTAAAATAAACTTT

Sequence ID 8-3^7- SEQ ID NO: 334 nt : 603

TGAGGNTGGTCATGATGCANAAGCTACTCAAATGCAGTCGGCTTGTCCTGGCTCTT 
GCCCTCATCCTGGTTCTGGAATCCTCAGTTCAAGGTTATCCTACGCGGAGAGCCAG 
GTACCAATGGGTGCGCTGCAATCCAGACAGTAATTCTGCAAACTGCCTTGAAGAAA
AAGGACCAATGTTCGAACTACTTCCAGGTGAATCCAACAAGATCCCCCGTCTGAGG
ACTGACCTTTTTCCAAAGACGAGAATCCAGGACTTGAATCGTATCTTCCCACTTTC
TGAGGACTACTCTGGATCAGGCTTCGGCTCCGGCTCCGGCTCTGGATCAGGATCTG 
GGAGTGGCTTCCTAACGGAAATGGAACAGGATTACCAACTAGTAGACGAAAGTGAT
GCTTTCCATGACAACCTTAGGTCTCTTGACAGGAATCTGCCCTCAGACAGCCAGGA 
CTTGGGTCAACATGGATTAGAAGAGGATTTTATGTTATAAAAGAGGATTTTCCCAC 
CTTGACACCAGGCAATGTAGTTAGCATATTTTATGTACCATGGNTATATGATTAAT 
CTTGGGACAAAGAATTTTATAGAAATTTTTAAACATCTGAAAA

Sequence ID ^3-» SEQ ID NO: 335 nt : 71

ATTTATCTAATATTTGGTTTAATAAAATGTGAATAATGAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAA 

Sequence ID 849 SEQ ID NO: 336 nt : 622

TTTTTTTTTATTTTTTGAGAATGGAGTCTTGCTCTGCCGTCCAGGCTAGAGTTCAG 
TGGTGCGATCTCAGCTCACTGCCACCTCACCTCCTAGGTTCCAGAGATTCTTGTGC 
TTCAGCCTCCTCAGTAGTTGAGAATACAGGAACACGCCACCACGCCTAGCTAATTT 
TTGTATTTTTAGTAGAGATGGGGTTTCACCATGTTGGCCAGGCTGGTCTCAAACTC 
CTGGCCTAAGTGACCCACCTGCCTCAGCCTCCCAAAGTGCTGGGATTATAGGCGTG 
AGTCATTGTCCCCAGCCGGATGTTTTCATCTTGATTTGCCTTAGTTTCTAAATCTC 
ATCCTCTCCATTTTCTCCTGTTAGTAGTCACAGAGAACCAAATTCTGTCAAGTTAT 
GAAACTAAAGTCTCTCTTCCACAAGTCTTCCTGTGTTCTGCCTCAAGTGAACTTGA 
AAGAACATCAGTTTGTGGGAAGGTTGAAGACCGAATGATCTGCTGGGAAATCACTG 
AGGCATTGCCATTCTCTTGAGGAATTTCATTTTCATCGAAGTTTCGGTTTATATCC 
CTTTCTTGGTGAGTACTATTGCTGTTATGTAAATTAAATGAGTCGTCATCCTTCTT 
NTGAGC 

Sequence ID 8-^- SEQ ID NO: 337 nt : 501

GTGAAATCACTTTCATGGATTATTAATGGATTTAAGAGGGCATCAATCAGCTCAAC 
TCAAGATTTCATAATCATTTTTAGTATTTAGATTGTGCCTCAAAGTTGTAGTACCT 
CACAATACCTCCACTGGTTTCCTGTTGTAAAAACCTTCAGTGAGTTTGACCATTGT 
GCTCTTGGCTCTTGGGCTGGAGTACCGTGGTGAGGGAGTAAACACTAGAAGTCTTT 
AGTACAAAACTGCTCTAGGGACACCTGGTGATTCCTACACAAGTGATGTTTATATT 
TCTCATAAAGAGTCTTCCCTATCCCAAGGTCTTCATGATGCCAGTAGCCATATATG 
ATAAATTATGTTCAGTGATAACTTAGTTATCAGAAATCAGCTCAGTGGTCTTCCCC 
GCCATGATTCACATTTGATGAGTTTTTAAAAATCAAAGTGATTTTGAAAATCTCTA
ATGGCTCAGAAAATAAAAACATCCAGTTTGTGGATGACTATATTTAGATTTCT 

Sequence ID 864 SEQ ID NO: 338 

TTGTGTTTTTAGGACTCCTTATCTAAATTAAGGCAGAGAAGTTACAGTATTTATAT 
CTGCATTAAATCTCAATTCCAGAAAAACCTTTTGAAAAATTATTTAATCCTCTGGA 
AACTATTGATATGATACAGGAGAAATTTTCAGAAGTTTATTGAATAATTTAATATC
ATTTAATAGGACACTCTGGCTTGTATATAAGCAGATACGTTACTCAGACTTCTTGG
CTGTACTCTAAAATAATATATGTACTAGTCTCCTAAATATTACTAGCTCACCTTTC
AAAATGCATACTAATATTTCAATGTCTTTCTTCAATTTGAAAAGCTCTTGAATATC
TACTTGTGATAGCCCTAAGAGCTGAGATAATTATTTCCAGGAGGTTGAATCCCTGA
TTCTTAACTGTTCAGCAATGCATAAGCAAGAGAGAATATGACATAAGAGGACCATT
TCTACATTAGCCATTTTTTTTCACAAGATACCTATGTGAATACAGGGCACCTGGGA 
GGGTAAGTGGAGGACTATTTCTAACTATATTTATAAGCACATACTGATATTGGTGA 
ATCAAAACCTACAGCAGTGCTTCTCAGATGGGAAGGGAGACAATGTGTAAGGAGAT 
CAGGAATTCATTAG

Sequence ID ID NO: 339 nt : 122

CCANAATCCACTCTCCAGTCTCCCTCCCCTGACTCCCTCTGCTGTCCTCCCCTCTC 
ACGAGAATAAAGTGTCAAGCAAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAA 

Sequence ID 867 SEQ ID NO: 340 

TTTTTTTTTTTTTTTTTTTCAGAGTCACAGATATTGTATAGCTGAGGTAAGCATTT 
TACAACTTTTCAGACACAAGTAAGTACATAAATATTATTTTACAACCAACAATNTT
TAATATTTCCACATTGAANAATAGATGTGATAATTAAATCTTTTATAAGGTTTTAA
AAAGACATGAAACATAAACCTAATTATACATAAAAGAAAAGAATTTTAAACAAGAG
CTTATTGNGATGACATTACTCATAACTTTTACCTTTAAAACCTTTTCTTGGGTAGC 
TATTCAAAAGTAAAGACCACAAGTTTTGTTGCCCANATTTCTTATGTTTNGTATAT 
TTAAGCTCTTTATTTATTGAACAGATGNGTCATTAATTCATTNGGAGCATTACTAT 
TATCAGTAAAATTTGATTTTTTTTTCCCCTCAGTCATAGGTAAATCAGCTCCACCT 
GGAATTTCTAAGGACCCAGTTTTAGTCAATATTTTCAAGTAATCATGACCTCAGAA 
ATAGTCTTAATTAAGATAACAAATATTAGCCATCAAAATGGAACCAAGACAAGATT 
CTAATGTTTGTAAACAGTCAATCCATATTTATGAATATTAGCATATATTGGNGAAT 
AGTTAAGGCAAAAGGGTCTAGCAG 

Sequence ID 8^ SEQ ID NO: 341 nt : 667

TTGTGTTTTTAGGACTCCTTATCTAAATTAAGGCAGAGAAGTTACAGTATTTATAT 
CTGCATTAAATCTCAATTCCAGAAAAACCTTTTGAAAAATTATTTAATCCTCTGGA 
AACTATTGATATGATACAGGAGAAATTTTCAGAAGTTTATTGAATAATTTAATATC 
ATTTAATAGGACACTCTGGCTTGTATATAAGCAGATACGTTACTCAGACTTCTTGG 
CTGTACTCTAAAATAATATATGTACTAGTCTCCTAAATATTACTAGCTCACCTTTC
AAAATGCATACTAATATTTCAATGTCTTTCTTCAATTTGAAAAGCTCTTGAATATC
TACTTGTGATAGCCCTAAGAGCTGAGATAATTATTTCCAGGAGGTTGAATCCCTGA
TTCTTAACTGTTCAGCAATGCATAAGCAAGAGAGAATATGACATAAGAGGACCATT
TCTACATTAGCCATTTTTTTTCACAAGATACCTATGTGAATACAGGGCACCTGGGA 
NGGTAAGTGGAGGACTATTTCTAACTATATTTATAAGCACATACTGATATTGNTGA 
ATCAAAACCTACAGCAGTGCTTCTCAGATGGGAAGGGAGACAATGTGTAAGGAGAT 
CAGGAATTCATTAGTCACCTTTCAGATGGTTTAATGCATACAGCTGTACCG 

Sequence ID 870 SEQ ID NO: 342 

GGAGTTTGAGCAGATCCTTCAGGAGCGGAATGAACTCAAAGCCAAAGTGTTCCTGC 
TCAAGGAGGAACTGGCCTACTTCCAGCGGGAGCTGCTCACAGACCACCGGGTCCCC 
GGCCTTCTGCTCGAGGCCATGAAGGTGGCTGTCCGGAAGCAGCGGAAGAAGATCAA 
GGCCAAGATGTTAGGGACACCAGAGGAAGCAGAGAGCAGTGAGGATGAGGCTGGCC 
CATGGATCCTGCTCTCCGATGACAAGGGAGACCATCCCCCACCCCCGGAGTCCAAA 
ATACAGAGTTTCTTTGGCCTATGGTATCGGGGTAAAGCTGAATCCTCTGAGGATGA 
GACCAGCAGCCCTGCACCCAGCAAGCTAGGGGGAGAAGAGGAGGCCCAACCACAGT 
CTCCAGCTCCTGATCCGCCCTGTTCTGCCCTCCACGAACACCTTTGTCTGGGGGCC 
TCAGCCGCCCCAGAGGCCTGACTTAGGGGTCTGGCTGTGGAAGGATGTGTGGCCTC 
AAATGAGGACAGGGCTCCCGCCTTCACAGCCCTCGCCAGGGGTCTGCCCCAATCCT 
GGCCTGCATCAGGCAAGGACGGGGTCTCAGC 

Sequence ID &^ SEQ ID NO: 343 nt : 642

GCAAGTCTTCAGTATGTACATTTATCCCCTAGAAGAAGAAAAATTAGTTGTGCATG 
AAAAAGAAACATTAACTGCAAAGCTAAATGCTCACACTCTAAATCAGTGCTCTCCA 
AAGTACAGCAGGCGGGAAAAGAAAATGGTAGATTTTTTTCTTCCAATTACTTTAAC 
TTATTCTTTTTAATGGACACTTCATACATAAATATATTCACAATATATTAATATAT
ACATAATGTATAAGCATACATATTGAATGTGCAGTCAAAAAATGTACTAATGGAAT 
GCTCTACCAAAACAAGTTCACGTTCATCTGTAAAATGGGAATAATATTTTTAAAAG
GCATACAGTCTGAACATTTTTAGATTATTCATAAAATCTATTCAGAAAGTTAAACT 
AAAAAATTTAACGTATGCCTATAACAAATTTTGTACTTAATGTAATTGNTTTTCAT 
CCTGAGATCTAATATCCTCGTTTTTAAGTAGAGCCACTTGTTTGCTACAGTTTAGT 
CAAAACGTTAACATTAGATGGGTAAAGTAATATGAAATCTTTCTACTACTCCAAAA 
TAGAAAACAGAACATTAAAAAGATAAAAATTCAAACATACTTACCAGTAGATTTTC
AACTGNGCAAAAGCTCATTGCATGGG

Sequence ID 873 SEQ ID NO: 344

GTTTTCCACCGTGAAGAGAACATTTCCTCTGGGAATGACAAAGCCCTCAGGAACNG 
CTTTTATTTCTATTGGAAGATGCCCATCATACTTCTGGCAGGATAAAATGATAAAT
TTATTTATTCAACAGATGATACTCAATTCCCTGCTGTTTTACTAAAGGTTCTTTAC 
GTTTTATAGAAGCTAAATTTACTGTCATAGAAATTGCAATTGTAGATGTTACTGTA
ATCTAGTCAGAATATCCTTATCCTTCTAAAATAAAACTAGTTAAAATTATTAACAT
ACGTACTGATATTAATTTTTAAGTTTAATGCTGCCACGTGCTTCTGCTAAGAACAT 
TTATCACTACAAGTGGCAGAAAATTCCAAACTCATCAAAACCAAACTGTTGCTTCT 
TCCCTGCTTTTTCAGAAAATGAGAAAGGATGACTTTATTCCAACATATTCTAAAAG 
TATTCCAAGAACACTACCTTTATTCTAAATTCGTTATTTTCACAAAATAAAGGCTG 
CAGATTGAAAGATAAAGGATTGCTATTAAAGAACAAAAGAAAACAAAACCGAGAGA
GAAGGAGAGCTAGGGAAATCCCTGCANAANAACCGAATANGGTCCCTCTATTCTGG
GCCGGGGCCTGAAACTATGAAACAGGCCAACACAGAATCTTGGCA 

Sequence ID 875 SEQ ID NO: 345 

CCTCTGACTCGCTCAGCTCACCCACGCTGCTGGCCCTGTGAGGGGGCAGGGAAGGG 
GAGGCAGCCGGCACCCACAAGTGCCACTGCCCGAGCTGGTGCATTACAGAGAGGAG 
AAACACATCTTCCCTAGAGGGTTCCTGTANACCTAGGGAGGACCTTATCTGTGCGT 
GAAACACACCAGGCTGTGGGCCTCAAGGACTTGAAAGCATCCATGTGTGGACTCAA
GTCCTTACCTCTTCCGGAGATGTAGCAAAACGCATGGAGTGTGTATTGTTCCCAGT 
GACACTTCANAGAGCTGGTAGTTAGTAGCATGTTGAGCCAGGCCTGGGTCTGTGTC 
TCTTTTCTCTTTCTCCTTAGTCTTCTCATAGCATTAACTAATCTATTGGGTTCATT 
ATTGGAATTAACCTGGTGCTGGATATTTTCAAATTGTATCTAGTGCAGCTGATTTT 
AACAATAACTACTGTGTTCCTGGCAATAGTGTGTTCTGATTAGAAATGACCAATAT 
TATACTAAGAAAAGATACGACTTTATTTTCTGGTAGATAGAAATAAATAGCTATAT
CCATGTACTGNAGTTTTTCTTCAACATCAATGGTCATTGNAATGTTACTGATCATG 
CATTGGTGAGGNGGTCTGAATGTTCTGACATTAACAATTTTCCAT 

Sequence ID g^& SEQ ID NO: 346 nt : 115

AAACTTTTGTGGCAACAGTGCACTAATTTGGATAATGTTTGTTCCCAATAAATTAA
GAGCCAAATTGTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAA 



Sequence ID 8^8- SEQ ID NO: 347 nt : 634

GCCAGGCTTTGTGAATTACAGGACATTTGAGACAATCGTGAAACAGCAAATCAAGG
CACTGGAAGAGCCGGCTGTGGATATGCTACACACCGTGACGGATATGGTCCGGCTT 
GCTTTCACAGATGTTTCGATAAAAAATTTTGAAGAGTTTTTTAACCTCCACAGAAC
CGCCAAGTCCAAAATTGAAGACATTAGAGCAGAACAAGAGAGAGAAGGTGAGAAGC 
TGATCCGCCTCCACTTCCAGATGGAACAGATTGTCTACTGCCAGGACCAGGTATAC 
AGGGGTGCATTGCAGAAGGTCAGAGAGAAGGAGCTGGAAGAAGAAAAGAAGAAGAA 
ATCCTGGGATTTTGGGGCTTTCCAATCCAGCTCGGCAACAGACTCTTCCATGGAGG 
AGATCTTTCAGCACCTGATGGCCTATCACCAGGAGGCCAGCAAGCGCATCTCCAGC 
CACATCCCTTTGATCATCCAGTTCTTCATGCTCCAGACGTACGGCCAGCAGCTTCA 
AAAGGCCATGCTGCAGCTCCTGCAGGGACAAGGACACCTACAGCTGGCTCCTGAAG 
GAGCGGAGCGACACCAGCGACAAGCGGAAGTTNCTGAAGGAGCGGCTTGCACGGCT 
GACGCAGGCTCGGCGCCG 

Sequence ID 879 SEQ ID NO: 348 

GTTGCCGGGTCCTGTGATAACTCTGTTTAACATTTTGAGGAACTGTTGAATGGTTT 
TTCACAGCAGCTGCCTCATTTTTTATTCCCATCAGCAGTACTTCTTGGTTCTAATA 
CCTCCACGTTCTCGCCAACACTTGTTGTTGTCTGTAATTTCGTTGTTAGCCATCCC 
AGTGGGGATGAAGTAGTATCTTACTGTGGTTTTCAGTTGCGTTTCCCTGATAATTA 
ATGATGGTGAACATCTTTTCATGTTCTTGTTGGCCATTTGTATGTCTTCTTGGGAA 
AAAAAAAATGTCTGTTCAAATCCTTTACAAAGTATTTATTTTTTATGTCAACAATA
TAACCACTCAGTACACTGCTTTTTANACAATGATCTTTTAAAGGTTTGTTTACAAC
ATTTAGCACTTGAAATTTTAAGGTTATGCCCTCAAAAAAATTGCTGAGGGAGCTAA 
GCTATGAAGATGCAAAGGCATAANAATTATACAATGGACTTTGGGGGAATCCAGGG 
AAAGGGTGGGAGGGGGGTGANGGA 

Sequence ID 881 SEQ ID NO: 349 

TCGACTCTGATTTTTTTTTCTCCTTCCTCGCAGCCGCGCCAGGGAGCTCGCGGNGC 
GCGGCCCCTGTCCTCCGGCCCGAGATGAATCCTGCGGCAGAAGCCGAGTTCAACAT 
CCTCCTGGCCACCGACTCCTACAAGGTTACTCACTATAAACAATATCCACCCAACA 
CAAGCAAAGTTTATTCCTACTTTGAATGCCGTGAAAAGAAGACAGAAAACTCCAAA
TTAAGGAAGGTGAAATATGAGGAAACAGTATTTTATGGGTTGCAGTACATTCTTAA 
TAAGTACTTAAAAGGTAAAGTAGTAACCAAAGAGAAAATCCAGGAAGCCAAAGATG 
TCTACAAAGAACATTTCCAAGATGATGTCTTTAATGAAAAGGGATGGAACTACATT 
CTTGAGAAGTATGATGGGCATCTTCCAATANAAATAAAAGCTGTTCCTGAGGGCTT 
TGTCATTCCCAGAGGAAATGTTCTCTTCACGGTGGAAAACACAGATCCAGAGTGTT 
ACTGGCTTACAAATTGGATTGAGACTATTCTTGTTCAGTCCTGGTATCCAATCACA
GTGGCCACAAATT

Sequence ID 883 SEQ ID NO: 350

TCATTTACATTAATACTCAAAACTGCTCGATTAAGCAGGTGCTGTTCTTATCGCCA 
TTTTGCATATGATGAGAAAGGGTAAGGTCACCCAGCTAGTATTTGGCTCACAGCAG 
GCCTTAAGACTTGGTTTGTGTGACTCATCAGTCCACGCTCCTAAAACCACTAAGTT 
GTTCTACCCTTTAATGTTGAATTAACATTGGATAGTGTTCAAGTTTANATGGGTGG 
GTGAGGGCCCAAGGACCTTTCAAACTCAGATCTCTTATTTAATAACCTGGTCCCAG 
ATCCATTCCTCTGTCGAAGAGGAAGTCATCCTTCAGTGGCTATTCATTGTGGGGTT 
AAGAGCGCAGACTATGAATTCAGTCTTTTTGGGTCCCAGTTTGCCAGACCTTGAGT 
GAGTGCCCCGAGTTTACTTACTTGTAAAGGTAGGTGGAGGTAATATAATTAAATAA 
ACTTAAAAAACTAATTAAAAACAAAACAAATGAACTAAGGTCTTAGGATATCTGGC
GTCTATTTTGCGCCAAATCACATAATGTCTATTGTTGTGTGTTGGACTATAGGATT 
GTCCTTTAACAGGGAAGGGTTTATTTCTGTAATCAAGTCTGTCAATATTATGACCA 
TGTTGATAATAGCTACCTTTAATTGAGGGCTTCCATGTGCCAA 

Sequence ID 885 SEQ ID NO: 351 

TCAGTGGAAAAGGGCAGGTTGAATCAAGGTGAATCAATCTGAAATTGAGCACACCT
GCCTGCCATCGCTGTTCCTTCAACTGAGTGCTGCACATCATGGGCTCTGTCTGTGA 
GAGAAAAATCCCGGTGCTTGGTGTCCTTGCATGACATGGAGTTTTGCATGTAGATC 
AATTTAAAATGTACCTCTTGTTTACATAATTTGCATAATTTTAAAAGATAATGTTG
CCAAACTTTGGAAATGTTAATGTTCANACTGAAAATCTCCACTACATGTAACTTTC
TTCCTCTGGATCAGTGGCATGGCTTATAATCCCAGCCAGTGGTTTGAACTGTTCCA 
GTGTCAACTGCCATGTGCTCTGCTTCAAGGGGGAACTAGCCTTTTGTGAATTTTTT 
GTACATAAGTATTTGTTACAAATATTTTAGCAAATGCTTTCTATTTCTCTTGCTTG 
TGCATATCTTGGCTGGCGTTACAGAAAAATAGTGTAAACATTATTTCCTTACCGGG 
GAATGAGGGTTTT 

Sequence ID 887 SEQ ID NO: 352 

AGCACCTGGCACAGAGTAGTAGCTAACACAGATGTTAATTTTGCTGCGTCAAATGT 
TTTCACTTTGAATCTCTCTTGAGTATTGTTCTCCTTATTGATTACATGATGACATC 
CTGTTTTCTCTCCCTGACCTTTACTGTTTGTTTAGAAAAAAAAAAAAAAAAAAAAA
AAAAAA 

Sequence ID 889 SEQ ID NO: 353 

CAGAGAGCTTGTTCCCTCCCTCCCTGTGCATGCAAACAAGAGGGCATGGGAGCACA
CAGAGAGATGGCAGCCACCTACAAGCCAAGAGGAGAAGCCTCACAATCAAACTCTC
GCTGCTGGCGAGAGTCTTGGACTCTGTCTTGGACTTCCAGCCTCCAGACTGTGAGA 
AACAAATTTCTGTTGTTTCAGCTTCTCAGTCTCTGGTGTTTTGTTATTGCAGCCTG 
AGAACACAGCTGTACNATTATNAGGGAAACAGAAAACACTGATACTTAACAATGCT
AATGCAATTATTTATTTGCTTTTCAGTCTCTACAAAACGTTCTAAAACACTAATCT
AAATATTAACAGTAAAATATTTGCATAACTAATGGAAACTAAGAAATCATATGACC 
AATATTTCACTTATTGGTAATCTTACTCTACTGATTTCCCCCCAGACTGTGATTTT 
TGAACTTCCTTGCCTTTCTCCTGTCTTTCTGNGTTTATTCATGGAATTCCAGTTAT 
CTGGGCTTGAAATTGCAGGCTCTCCTAACTTAAGCAAAATCTGACAGATCAGCAAA 
ATGAGATAAATGTTTCTTTTTTCTTTCTGACTGCATTAAATCAGATACAACTCAGC 
ATTAAAAAGCTATCTTTGNAAAATGNTGGTACTAATAAATTAGTCTTA

Sequence ID 890 SEQ ID NO: 354

CCAGTTCCACATTCAGTGAAGTCATGAACTTGAAATTGGCCATGATCAAAAAGTAT 
TTAAATCACAGAAGTTGCAAATGCCACAAATCAAGGTCTTTTTCTCTTGGAGAACC 
TGTTAAACATTTACCAACTCACGACCGCCATGCACCCAATACTGCAATAGGTCTAT 
AGATGCAGATACTGTCTCCATGAATCTTATAGGCTAGAAAGGAAATAGATAAGTAG 
TCCTACCAGAAGAACATGATGAAGGCATTTGTGGTAAACAGAATGATGGCCCCCCA 
AAGATGTCCACATCCTAATCCCTGAAGCCTATGAATATACTACTTTACTTGGCAAA 
AGGGACTTTGCCACAGGTTTTTAATTAAGGACCTTGAAATAGAGAGATTATCCTGG 
ATAATCCAGATGGCCCCAGTGTAATCCCAAGGGTCCTCACAAAGGGTAGGAAGGAG 
AGCCAGAGTCAGAGAAGGAGACGTAGCAATGGAGGCAGAGGTCANAGAGAGATCTG 
CAGATGCTGCTGTGTTGGCTTTGAAAATGAGGAATGCAGGTGACCTCAANGNGCTA 
GATGATGCAAGGAAACAAATAATCTCCTATGAACCCTAGGATGGGCATTATTATGA 
GTCCTATTTTATAAACAAGGAACTGACNTCCAGAAAGATAAATGC 

Sequence ID 8-^ SEQ ID NO: 355 nt : 626

GGCAGAGGTTGCAGTGAACTGAGATCATGCCATTGCAATCCAGCCTGGGCAACANG 
AGTGAGACTCCATCTCAAAAAAAAAAAAAAAAAGACAAGAGTNTCCACTCTAAACA
CTTNTATTCAACATAGTCCTGAAAGTCGTAGCCACAGCAATTTAACAAGATAAAGC 
AATAAAATGTATTCAAATAGAAAAAGAGGAAGTCAAATTATCTTCACTGGNGATAT 
AATTCTCTACCTGGGAAACTTCACCGAAAAAGATTTCACCAAAAGATTTCTAAGCC 
TAAATAATGACTTCAGCAAAGTCTCACCATACAAAATCAACATACACAAATGAGTA 
GCATTTCTGTGCACCAATAATATTCAAGCTGAGAAAAAAAGAACATGGTTCTATTT 
ACAATAGCTACAAACAAAAAAATATGTACCTAGTAATACATTAAATCAAGGNGGTA
AAATATCTNTACAACAAGAACTACAAAACTGCTGAAAAAAAATAGAGACACGCAAA
TAAGTAAAAAGGCACTCCATGCTCATGAATTTAAAGAATCAATATAATTAAAATGT 
CCGNGCTGCCTAAAGCAACTTACAGATTAAAGGCTATTTCTCTCAAACTATAAATG 
CACCTTTTTA 

Sequence ID 8-^- SEQ ID NO: 356 nt : 585

GTCATTGCTGGGTGGCGCCAGCCCTCAGACTTGCCTCTTTGCAGTAGGAAGAAGGC 
CTCCCCACATACCTTCCCACACTCATCACCTTAAGCCAGACTCGGTGTCCAGTGAA 
TATGACCATCTCTTGCCCATTTTCTAATGAGTGTTTTCATTAATGAGTTATAAGAA 
TGTGGTGGGTAAATCTATGGGCTTTGAACTAGTGAATCAACTTGGTTTCAGAATCT 
GGCACTGCTACTTACTAGTGAATTTAAGCAAGTTATTTCACCTTTCAGAGTGTCAG
TTCCCTCATGCATACAAGGAAGATAAAAAATAATGTNTACNAAAGTATTGGAGTAA
TTAATACATGGAGAACTACATGTAAAGCGTTTAGCATGATGTCTGACATATTAAGC 
ATCCAATATTAGTNGCTTGCAGAATTATTAGTAAAAGAGATTGCTTCTGAAAGCCA 
TTCCAATTCTTAAATTTTATAATGCCACATTTGAGGTCACCTGAAGTCGTGTATAA 
CATGTGTACATTTTTGCGATTTATTTTTTCAATTCCCANATTAAAGGCATAGAGAT 
ATCCTAGCNANGGACTCCAAGTGTG 

Sequence ID 8^5- SEQ ID NO: 357 nt : 560

GTAATTGCAGCCTGGGCAACGGAGTGAGAGACTGTCTCAGGAAAAAAAAAAGAAAA 
AAAACTACTGAGGTAGTTGAATATATCCTCCATTCCCCATTTGTGGATTAGTTAGT 
AAATGGGGCATCTTAGGGTTTAAATATGTCCAGGGTCACTGAGGATCAGATCCTAG 
GGTTCCTTTGACTCAAGGCTTTTGTCTCAGCAAAACGTCACCTTCCAGCAGGAAGG 
CTTTCTCAGGCAAGTAGCAGGGTGGCTACTATGTATCGCTTCTTTATTTTTTCTTT 
TTTAAAATAATGCAGGCACCGTGCGCATAATTTAAAAAATCAGTGCTAAAACCCTT
AAAAAAAAAAAGCTGTTCTCATCTCCTGTCTTTCTTTTTTTTTTCTTTTTATTTTT
TTCTTTTATTATTATTATACTTTAAGTTTTAGGGTACATGTGCACAACGTGCAGGT 
TTGTTACATATGTATACATGTGCCATGTNGGTGAGCTGCACCCATTAACTCGTCAT 
TTAGCATTAGGTATATCTCCTAATGCTATCCCTCCCCCCTCCCCCCTTTTTTTTTT 

Sequence ID 896 SEQ ID NO: 358 

GGGAATGTCTTAGGCACTGGGACTGTAAGTGCAAAGACCCTGTGGCACAAGGGAAT 
GTTAATTATCTACCTTTCANAAACTGGAANAAGGCCTAGCCTAGAGCATTGAAAAC
AATAAGGGAAAGGAGGAGTAAGGCTGGANAGATAGGAATGGTTTAAAGTCTTTGTT
AAAAATTTTTTTAAAAAAATCTTTATCACAAGAAGAGGATTGGCNTGATCAAATTT
GACTTTTAAAAANATTACTTGGGTTGGGCATGATCAAATACTACTTAGGGAGATTA 
GTTTANATGATAATGGCATTCTGGACCANAGTGGAGTCAGAGGTGAAAAGAGGTAG 
ATATTCCANAATTGAGGGATTTGTGAGGTGAAATCATTTGTTACAGATATTAAAGG
ATAAGGAGCTTTGTCAAAGGGGATCTTAAGTTTCTGGTATGGTAACTGGGTTAGAG 
AGCCCTGGAACATGACCAGCTTTAAGGGAAGAGAGCTTGAGCTCTGTTCTTGTTAA 
GCTCAGTTTGAGATCTTTGTGGAATCAAGTGGAGAGGTCTAAGCAGGGAACTGGCT 
TGGCTAGGCTGTAAAGATGAATCTGAGAGTCCCAAGAATATGGTAATTATTAATAA 
AAGCCTTAGGTANATGAAATTGTTTTGGG 

Sequence ID 8-^ SEQ ID NO: 359 nt : 509

GCAAATCTACACATTTGATTAAATGATAGGGAACTATGCACACACATAATACATAT 
AATGCTAGTTTCTTGGTTTTGATATTGTACCATAGTTATGTAAGATGTAACCATTG 
GGGGAAACTGGGTGAAGGCTACATGAGACCTCTCTGTACTTAATCTTTGCAACTTA 
TGTGAATCTATAATTATTCCAAAATAAAAAGTTTTAAAGAACCTAAGTATCCTTAT 
TACTGAGGGTCATCGTGCTAGACAGCAAGGTTGGGCCAGAGCTTCTAGTTATTTAA 
AATACTAAATACCAGCCTGGGCAACATAGCAAGACCCTGCCTCTACAAAAAGCAAA 
AAAATTAGCTGGGCATGGTGGTACATGCCTGTGGTCCTAGTTACTCTTGGAGGAGT 
CTGAGGTGGGGAGCTTGAGCCTAGGAGTTTGAGGCCGCAGTGAGCCTTGATTGTGT
CTCTGTACTCCAGTCTGGGCCACAGAGCAAGACCCGGTCTCTAAAAATAAATAAAT
AAATA 

Sequence ID 898 SEQ ID NO: 360 

ANTGCACTCCAGCTTGGTGACAGAGGGAGACTCCATNTTAAAAAAAAAAAAAAAAA 
AAAAAAAGGGAGTAGCTTGAAGCCACATAGTAGTTAGTGGTAAAGGCCACCCCTTT 
TCCCACAACTCACACCAGCACCACAAGCTAGCCTTTNTAATTTCCAAGCCAGTGCC 
CTTTCAACGCACACACCCCTGTGTCAGTTCCCTTTCTGCTGCAAGCTCTCTGGAGG 
CAGATACTGTTGAGTCCCTGGCCTGCCTATGAGAACGGCTCATGATCTCTATTTCT 
TCTGCTTAATGACCATCTCGAAGTAACAAGTTTAGCCTAAAATAAACTTGCTAAGT 
TAGCAAAGGAAGTCCTTAGCAGCCACCATTTCTCGATTCCTCCATCACCTCCCCTG 
CCCCTCAACTCCCTCATTTCTCCCAAGATATGGGCTCCAGGCTGGGCGCGGTGGCT 
CACGCCTATAATCCTAGCACTTTGGGAGGCTGAGGTGAGCAGATCACTGAGGTCAG 
GAGTTCG 



Sequence ID 899 SEQ ID NO: 361 



TCNTTCGGAACGCGCC 






Sequence ID 900 SEQ ID NO: 362 

CTGGAGGGATGGGTAGGATTTTGACAAGAGTGGTTGAAGGTATTCTAATTCACTTA 
GTACCTACATGTGCGAGGCAGCATGAAGGCAAAAAAGCCTGGGGCATGTTCAGAGA 
ATAGCAAGTATTCTAGTTTGAGTGGCACCTGGTACGTATATAAGGGAATAGTAAAA 
GATCTGGCTGGAAAGGAAAAGTAGGGGCAGGTTACGAAGGACCTCTGAAAGTCAGA 
CTGTGGAACTGGAACTTTTATCAGGAAGCAGTAGTTAGTTTTTTCAAGCAAAAGCT
AATTAGAGTTGATATTTAGGAGGATGAATCTAACAGTTGTGTGCAAGGATGCCTTC 
AAACTGAGTGAGACTAGTACTGGAGACTGGTTAAGAGACTACAACAATAACCTGAG
TAAGAATTAATACAGGCCTGACCTAGTTTTGAGTGAGTAGGATTGGAAACAAGAGT
TTTAGGTATTATAGGATTTATGCATATAAAATGGACTTGACAGAACTTGAAGAAAG
AGAAAGTGTCAAAAGGACACAGAAAGTGAGGCAGGATATCTTACAATGTTAAAGGA
AAGGAATAATAGAAGTTAC 



Sequence ID 903 SEQ ID NO: 363 

GGAAACATAAGCTTGTTTCAGTACACTCACGCTGTAGATTAATTCTGATATTACAT 
ATCTCCATCAGACTTTGTACCCTCTCTCTTCCATCCCTTACCCTTACCGATTAGGT 
TGGTATTACCTAAAAATCCATAGAAAATGTCCAGGTGAATTGCCTTATGCTTTCTA 
CCCCATAAGGTATAATT 

Sequence ID 904 SEQ ID NO: 364 

CTCTGTGGTGTGAGAACACAGTGGGTGACCAAGGCTTTCCAGATGAACCCAAGGAA 
AGTGAAAAAGCTGATGCTAATAACCAGACAACAGAACCTCAGCTTAAGAAAGGCAG 
CCAAGTGGAGGCACTCTTCAGTTATGAGGCTACCCAACCAGAGGACCTGGAGTTTC 
AGGAAGGGGATATAATCCTGGTGTTATCAAAGGTGAATGAAGAATGGCTGGAAGGG
GAGTGCAAAGGGAAGGTGGGCATTTTCCCCAAAGTTTTTGTTGAAGACTGCGCAAC
TACAGATTTGGAAAGCACTCGGAGAGAAGTCTAGGATGTTTCACAAACTACAAAGC
TGAAGAAAATGAAGCCCTATTACTTGTTTGTAAGATTTAGCACCCTTCTGCTGTAT 
ACTGTACTGAGACATTACAGTTTGGAAGTGTTAACTATTTATTCCCTGTTAAAATT 
TAACCTACTAGACAATGATGTGAGTACCCAGGATGATTTCCTGGGGCACAGTGGGT 
GAGGAGATGGGGACAGGTGAATGGAGGAGTTAGGGGAGAGGAAAAGTGGATGGAAG 
TGTCTGGAAAGGGCACCAAAAAAGTCTTCCAGGTCTGATCCTGTTTCTTGCTCTGA 
GTGCTAGCTACCACTGTGTCACACTGTAACATN

Sequence ID »^& SEQ ID NO: 365 nt : 655

CTCAGCTCTTGCCTGGTCACCTTGTGGCTTTTACCATCCTCATCCCCTGTGCCACC 
CACATCCTGCCACTTCTGCATGGAGTTGGGGTGGGGCCATTGGAGAAAAGAGGTTA 
AACAAGCAGTAATTTACTTGAGTACAGTCTTTGAGCCAATGAAATGCCAGTCATCA 
TTTCCCAGGGGTACTTGTCATCTTGTCAACAACCCGCTGATAATGCTCCTTCAATG 
TGAATAGCAAAAGTAGGGAGAGACGCTGAATGAAGAAGATGCCTACCCCTCAGGAA 
GACTGCTGTCCGCCTCCAGGCCTGCATGCACACACCCATGCCCACCTGCACCCCCA 
GCACCACGCCCACACTCACTCGCACACACCCACATGCCAGTGTTTTGGGGTTGGCA 
GCCTGGACACTGCTGAGGCAAACACAAGTCATCAAGCATAATTCTCATTCTCTCCT 
TCTGTCTCTGTTTTAGTTACAGGAATTTGGTCAGTTTAGAGGATTTAATAAGTCCG 
TGGAAAATTTGTTTCTGTCTCTTGCTACCCACGTGAAAAGTAAGTGCATGCTTCAT 
GATGTGTTTTCCCACTACCTTCCAGGCCAGCCGAGCCCACTGGCCANGGCCTGGCC 
CGGTGACCTCGGTTGACACTGTCCTCANGCCACTCACTT 

Sequence ID 906 SEQ ID NO: 366 

CAGAATTTCATGTTTATGCTGCACAAGGCCTGTATTTTATAATGGTGGCTCTTTTG 
GACGATGACTTCCTCGATGGTGAAACTTCCAGTAATCTCCCTCATCATACTGAAAT 
GATATCAGTATATCATCAGAACACCATGGAGCTTGTCATTTGAGGGACACAGCTTG
CTTGTGTGCTTGGGAAAGAAGAGGTTTAGCATGGTTTCAGGTCAGTGATGAGTCCA 
ATGATCTCTGCAAGTTCCCTTAGCTCTGANAATTCTGATGTCATATGCACTTCTGC
CGCCAGAGTTGCTGCTTACTGGATGCGTAAGAAGAAAAGAAAAAAAAAAAAAAA

Sequence ID »^7- SEQ ID NO: 367 nt : 582

CTTCCATTGGGGGTAAAGATCAAACTTTAGGCGAGCCAGGTCTGTATCTCCATTCC 
TGTCTCTGACTGCTTCCCTGTAGGGATTGTCTGCAAGCGCACACCTGCATTTTCTT 
GTCCACAAGTCTATGCTCTAACTCTGTCACCTGCATGGCTGCAAATTAGCTTCCTT 
CTTCCTGCCCTCTTCTCTCTAGCTTGGATTTTGAATTTGAATGGCAGGCATGGGAT 
GTCCGTGTGTGTGTACTGCTGATGTGTACAGCCGCTTGTTAGCGCTCTCATTGTCT 
TCAAATGTAAGTCATTTTGGCTGGGTGCGGTGGCTCATGCGTATAATCCCACGCTT 
TGGGAGGCTGAGGTGAGCTGATCATTTGAGGTTAGGAGTTCGAGACCAGCCTGGCC 
AACATGGCAAAACTCCATCTCTACCAAAAATACAAAAATTAGCTGGGTATGGTAGT 
GCACGCCTGTAATCCCAGCTACTTGGAATGCTGAAGCAGGAGAATTGCCTGAACCC 
ANGAGGCGGAGGTTGCGGTGAGCCAAGATCACGCCACTGCACTCCAACCTGGGTGA 
CAGAGCAAGGCTGTGTCTCAAA

Sequence ID 908 SEQ ID NO: 368

ACCTGACTTCAAACTATACTACGAGGCTACAGTAATCAAAACAGCATGGTACTAGT
ACAAAAACAGACCAATGGAACAGAATAGAGATCTCAGAAATAAAACTGCACATCTA
CAACCATCTGATCTTCAACAAACCTGACAAAACGAGCAATGGGGAAAGGATTCCCT
ATTTAATAAATGGTGCTGGGAGAACTGGCT-AGCCATGTGCAGAAAATTGAAACTG 
GACCCCTTCCTTACACCTTATACAAAAATTAACTCAAGATGGATTAAAGACTTAAA 
TGTAGAACCCAAAACGATAAAAACCCTAGAAGAAAATCTAGGCAATATCATTAAGG 
ACATAGACATGGGCAAAAATTTCATGATGAAAACATCAAAAGCAATGGCAACAAAA 
GCAGAAACTGACAAATGGGCTTCTGCACAGCAAAAGAAACTATCGTCAGAGTGAAC 
AGACAACCTACAGAATGGGAGACAGTTTTTGCAATCTATCCATCTGACAAAAGTCT
AATATCCAGAATCTACAAGGAATTTAA 

Sequence ID 910 SEQ ID NO: 369 

CAAAAAACAAGAATTACCCGGGCTTGGTGGTGCATGTCTGTAGTCCTATCTACTCA 
GGAGGCTGAGGCTGAAGGATCACTTGAGCCCAGGAGTTTGAGGCTGCAGTGAGTGA 
GCCATGATCATGCCAGTGTACTCCAGCCTTGGCAGACTGAGCAAAACTTGGTCCCT
CGCAAAATGTTGAAGCCCAGTTTTCACTATTAACCTGTATTTCAGTTTCCCCATGC 
TAACTTTGAAACACTGGGGCTGGCCTGAGGGTATAAAGGCTTATTCAAACTCAGTA 
ATTTAAACTTAAAATCCTAAGGAACTTCAAAAAGTGTAATCTAGTCCAAATGGGGC
ATCAATTCTAAAGCATTTGCTTGTTTGAGCAGATTTTCTGTGTCTGAGGTATATAG 
ATAACTTATCTTTTTATGACTAAATCCAAGTCCTTAGTTCCTGTTGGAATTCAAAA 
TCATATTTAAAAATTGATGCTTTGTTCTATAATTAATGCTTTGATTGTATAAATAA 
TAAGTATTCTTCCAAATCCCTTTTTACAGATGATGATTCTGATACCGAGACGTCAA 
ATGACTTGCCAAAATTTGCAGATGGAATCAAGGCCNGAAACAGAAATCAGAACTAC
CTGGNTCCCAGTCCTGTNCTTAAAATTCTAACTCGAC 

Sequence ID — 911 SEQ ID NO: 370 nt : 595

GAGGGTGTAGAAGAGAAGAAGAAGGAGGTTCCTGCTGTGCCANAAACCCTTAAGAA
AAAGCGAAGGAATTTCGCAGAGCTGAAGATCAAGCGCCTGAGAAAGAAGTTTGCCC
AAAAGATGCTTCGAAAGGCAAGGAGGAAGCTTATCTATGAAAAANCAAAGCACTAT 
CACAAGGAATATAGGCAGATGTACAAANCTGAAATTCGAATGGCGAGGATGGCAAG 
AAAAGCTGGCAACTTCTATGTACCTGCAGAACCCAAATTGGCGTTTGTCATCAGAA 
TCAGAGGTATCAATGGAGTGAGCCCAAAGGTTCGAAAGGTGTTGCAGCTTCTTCGC 
CTTCGTCAAATCTTCAATGGAACCTTTGTGAAGCTCAACAAGGCTTCGATTAACAT
GCTGAGGATTGTAGAGCCATATATTGCATGGGGGTACCCCAATCTGAAGTCAGTAA
ATGAACTAATCTACAAGCGTGGTTATGGCAAAATCAATAAGAAGCGAATTGCTTTG 
ACAGATAACGCTTTGATTGCTCGATCTCTTGGTAAATACNGCATCATCTGCATGGA 
GGATTTGATTCATGAGATCTATACTGTTGGAAAAC

Sequence ID » ^ SEQ ID NO: 371 nt : 651

CATTTCCAGAGTTTATGTGAATTGAATTGAACTATGGTTTTATGTTACTGTCAGTA 
GAATGAAGTACGAATATTTGAAAAATACACCTTCAACTTCAAAGTGATTCTTGACA
AAAATTATAAGGAATCATTTTGGACACATTTTCTGGTAGAGCCTTGTAAAAATTAA 
AACCAAGTGTTGTTTTCAAGAAGAACTGTAATACATAATCAGGAATTTGAGTAGGG 
AGATTATTTTGTTATTTAAAATTAAAGTGGCTGTGTAGTTTTAACTTTAGTATTGC 
AGGTAGAGTAAGCTTACATGATAACAAAAATCTTGGTCTTAGTGACTTAATGATTC
TGATATTTATTGATTGATTGGTTATCATTCCAAATATTTTAAAAGATAATAGCTGG 
CTGGGTGCGGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCCAGGACGGGCG 
GATCACGAGGTCAGGAGATCAAGACCATCCTGGCTAACACGGTGAAACCCCGTCTC 
TACTAAAAATCAAAAAATTAGCCGGGTGTAGTGGCGGGCACCTGTAGTCCCAGCTA
CTCAGGAGGCTGAGGCAGGAGAATGGCATGAACCTGGGAGGCGGAGCTTGCAGTGA 
GCTGAAATCGTGCCACTGCCTCCACCTGGCGACAA 

Sequence ID 913 SEQ ID NO: 372 

GTGAGGTGGGGACTTCATTCATTGTCCTATTTCTATCTCCACTTTGTGCCTGGAGA 
GCTTTCAGGGGAGGTGGAGGAGGAGGGTCTGCCAAGCTACTGCAACATCTGTCACC 
CACTATACCCAGTTACTTGGGGGAGGACAGACACTGTGGTGTCATTAAAGTTGTTT 
GAACCAAAGTGGCGGCTGCATCTTTGTCCCGATGCTAGCCGTGCCGGTCTCCCATC 
ATCCGCTCGCCCTCCTTTNCCCTGGGCTGCGCCCACTTGTCTTCCTGGATATTTGG 
GGGTGACTCGCCATGCTTGGCACCCTCTGCTTCCTGGTGCTGCTCTGACTCGAAGA 
CGGGACAGTCCCTGGTGCACATCCAGGGAAGAGGAGTGTCGGTAGTTCTTGCAGTA 
GGCACTTTATCAGGACCTGACCTGTTGCTGGGTGATTTTAGTCTCTACAAACAGAA 
AGCGTTTCAAAGCGTCAGCTGTGGGAGCAGAGTGACCCTTTGCTGATGCTGGGGGG 
AGGGGATCTAAATCCTCATTTATCTCT 

Sequence ID 914 SEQ ID NO: 373 

GGCGCCTGCTGGAGGAGGAGAGAGCTCTGCTGGCATGAGCCACAGTTTCTTGACTG 
GAGGCCATCAACCCTCTTGGTTGAGGCCTTGTTCTGAGCCCTGACATGTGCTTGGG 
CACTGGTGGGCCTGGGCTTCTGAGGTGGCCTCCTGCCCTGATCAGGGACCCTCCCC
GCTTTCCTGGGCCTCTCAGTTGAACAAAGCAGCAAAACAAAGGCAGTTTTATATGA
AAGATTANAAGCCTGGAATAATCAGGCTTTTTAAATGATGTAATTCCCACTGTAAT 
AGCATAGGGATTTTGGAAGCAGCTGCTGGTGGCTTGGGACATCAGTGGGGCCAAGG 
GTTCTCTGTCCCTGGTTCAACTGTGATTTGGCTTTCCCGTGTCTTTCCTGGTGATG 
CCTTGTTTGGGGTTCTGTGGGTTTGGGTGGGAAGAGGGCCATCTGCCTGAATGTAA 
CCTGCTAGCTCTCCGAAGCCCTGCGGGCCTGCTTGTGTGAACCGTGTGGACAGTGG 
TGGCCGCGCTGTGCCTGCTCGTGTTGCCTACATGTCCCTGGCTGTTGAGGCGCTGC 
TTTAACCTGCACCCCTNCCTTG-CTCATANATGCTCCTTTTGA 

Sequence ID — 915 SEQ ID NO: 374 nt : 230

TTTGAGACCAGCCTAGCCAACATGGTGAAACCCCATCTCTACTAAAAATACAAAAA
TTAGCCGGGCGTGGCGGCACATGCCTATAATCCCACTTACTTGGGAGGCTGANGTA 
GGAGAATCGCTTGAACCCANANAGGCAGAGTTTGCAGTGAGCCGAGATTGTGCCAT 
TGCACTCCAGCCTGGGCGACAGAGCGAGACTCCATCTAAAANAAAATAAATGAATA 
AAATAA 

Sequence ID 917 SEQ ID NO: 375 

NNCAGATTTTTTTTTTTTTTTCAGNGTTAGACCATCTTTCAATTCCTGGAACAAAC
TTAACTTTCCATGATATGTATTTTTTATACATTGCTGGATTTTATTTGCTAATATT 
TTACTTAGGATTTAATTTTCTAAGTNGACCTATAATTNTCCTGTATAAAATTGCAT
TTGTCACATTTTAGTATCAAGGTTGTCCTANCNCCATGAAATGGATTTANAATGGT 
TTATGTAANATAAAGTACATTTCTTCTAAAGGTTTGNGTGGATTAACTTTCAAATC 
TGCCANAGNGNGTTTTTTTCCTTTTTTTTTTTTTTTCATTTNAAGGGAGNGCAAGT
ANCTTTTCAAATNCTGATTTAATTTTTAAAATATTTNCAAGTNTNTTTANAGTTTT 
TATTTNTTNTNGAANGTTAACATTTTTATANAAAANGGTNTTATCTTTTTAAATTC 
TTTGACATCAGTTTCTTCANAATTCCTTCTTTTAA

Sequence ID 926 SEQ ID NO: 376 

GTCATATCTCTTCCCAGGGAAAGCAGGAGCCCTTCTGGAGCCCTTCAGCAGGGTCA 
GGGCCCCTCGTCTTCCCCTCCTTTCCCAGAGCCATCTTCCCAGTCCACCATCCCCA 
TCGTGGGCATTGTTGCTGGCCTGGCTGTCCTAGCAGTTGTGGTCATCGGAGCTGTG 
GTCGCTACTGTGATGTGTAGGAGGAAGAGCTCAGGTAGGGAAGGGGTGAGGGGTGG 
GGTCTGGGTTTTCTTGTCCCACTGGGGGTTTCAAGCCCCAGGTAGAAGTGTTCCCT 
GCCTCATTACTGGGAAGCAGCATCCACACAGGGGCTAACGCAGCCTGGGACCCTGT 
GTGCCAGCACTTACTCTTTTGTGCAGCACATGTGACAATGAAGGACGGATGTATCA
CCTTGATGGTTGTGGTGTTGGGGTCCTGATTTCAGCATTCATGAGTCAGGGGAAGG
TCCCTGCTAAGGACAGACCTTAGGAGGGCAGTTGGTCCAGGACCCACACTTGCTTT 
CCTCGTGTTTCCTGATCCTGCCTTGGGTCTGTAG 

Sequence ID 938 SEQ ID NO: 377 

TGGCCATCCTTTTCCCCCCAAACACACCCCCTTAACCTATCTCTTGGGACTTAGCC 
CGACCCTCCCTCTCATTTCCCATTAAGTCTGAGAGGCAAGAGCTAGGTTAGGCAAG 
GAGGTGGTTGGCCAGAGATGGGGAACAGCCAGGTGCCCCAGTCCTCTGATTTTTCC 
TCCATCCTGCTTACCACCTCCCTGGGTACTTACAGCCTTCTCTTGGGAACAGCCGG 
GGCCAGGACTGGGTCACCTATGAGCTGAATCAGCATCTCCTCCTGAGTCCCAGGGC 
CCCTGCAGTTCCCAGTCTCTTCTGTCCTGCAGCCCTTGCCTCTTTCCCACAGGTTC 
CACTTTATATCCACCTTTTCCTTTTGTTCAATTTTTATTTTTATTTTTTTTATTAT 
TAAATGATGTGGTCTATGGAAAAAAAAATAAAAATCTGACTTAGTTTT

Sequence ID 939 SEQ ID NO: 378 nt: 513

GGAACCCAGTGTATTACCTGCTGGAACCAAGGAAACTAACAATGTAGGTTACTAGT 
GAATACCCCAATGGTTTCTCCAATTATGCCCATGCCACCAAAACAATAAAACAAAA 
T T C T C T AAC AC T GC AAAG AGT GAGC C AT GCC T GT T AAC AC T G T AAAGAAT GT AAC A 
TGTGGGGGACACACAGGGGCAGATGGGATGGTTTAGTTTAGGATTTTATTAGTGCA 
TGCCCTACCCTCTGGGGGAACGTCCCATCTGAGGTTTTCTTCTCGGTGGGGGGATT 
TAACTTCTGTCCTAGGGAAAACAGTGTCTGATGAGGAGTGTTTCCAACACAGGCTA 
CATGAATTCCCCTATACCAGTGCGAAAGCAGCCAGGAGTCCCCGTTGGAAAAGAAC 
AATGCCACTCTCTTTTATGTATCTTGGTTCTGCAACTCATTTGTTGTAAGTAGGGT 
TAATCGAGTATCAGGTTCACAGTATCCTGCCCTTATTATTTTATGATTCACTGACT 
CAAGTTCCA 

Sequence ID 947 SEQ ID NO: 379 

GAG AG T G A A A A A A T T C T G G T AC A A A T T G G G A A A TTAGTATAT A AC A AC A T AG T G T T 
AAATTCAATGGGAAAAGTTTAATAAGAGGATTTGGTATCAACTGGCTGTCCAAAGA 
TAAAAATGGACCGTCCTATCACATACAAAATTGTTTTTTAGATAAAGATTTAAATA 
CAGGCACTCCTTCATTTGCGTGGTGCACCTTGAGGTGTTGCAGAAATGATGAGAGC 
TGAAACTGCAAAGCAATTTTAATACTTTATCTGTTGGAAATCTTATAGTTTTCCTG 
TGACCGTTAAAATTTTCATTAAACTATTAAAAACACCCATGACTGGTCACAAATGT 
AT T GGGAAAT GGAAAAGAAT T AAT AC AC T AAAAAT AC AAAAAAT AGAAAAT AT T T A 
AAATTATCTAAAAATTTGAAACATTAGAAAAATTGAGAACTAGGCAGGGCGTGGTG 
GCTCACATCTGTAATTTTAGCCCTTTGGGAGGCTGANGCAGGTGGATCACCTGANG
TCAGGAGTTCGAGACCAGCCTGCCAACGTGGGGAAACCCCGTCTCTACTGAAAATA 
CAAAAATTANCCGGGCATGGTGGCACAAGCCTGTAATNCTTGCTNACCAGGANGCT 
GAGGCAGGAGAATCACTTGAACCCANGANG

Sequence ID 949 SEQ ID NO: 380 

GTTTCACATGAGAAGGTAGTATTATGTACAGTGACCTTGTTTAAAGTGTCNGTTTA 
ATGTTACCACTAAGGCCCTGCCCCAGCTTTATCACCTGAGCACTAACAAGTGCTGT 
GTGGAGTTCAGTCCATGCTGGTAACTNTTGAGTATTCAGTGGGTCTTTTAACAATT 
ACCACCGTGGAGGANANAGCAAGGAAGAGAAATGCTGTGATCTTTTNCTGTTTTTA 
AT T AGNGAAAG AGGGAT T ANAT T AAAC AAAT G T T AC AG AGNT GT G AC TN T GAT CC C 
CCAGNGGTAAGCAATAATTGTANAGACTGGATTTNANAAGCCCTGAGAGTTTATTT 
TCAACCTATNTATTATAGNNCAATCC 

Sequence ID 1028 SEQ ID NO: 381 

ACAAGGCTTGGGGGCTGGACTCCCTCTACTGCCTCTGGCCATACCCCCTCCTGGAG 
ATGGGGTCAAGGCACCAGGACTGA 

Sequence ID 1056 SEQ ID NO: 382 nt: 435

TCGCTTGTAAAGCCTGAGACAGCTGCCTGTGTGGGACTGAGATGCAGGATTTCTTC 
ACACCTCTCCTTTGTGACTTCAAGAGCCTCTGGCATCTCTTTCTGCAAAGGCATCT 
GAATGTGTCTGCGTTCCTGTTAGCATAATGTGAGGAGGTGGAGAGACAGCCCACCC 
CCGTGTCCACCGTGACCCCTGTCCCCACACTGACCTGTGTTCCCTCCCCGATCATC 
TTTCCTGTTCCAGAGAAGTGGGCTGGATGTCTCCATCTCTGTCTCAACTTCATGGT 
GCGCTGAGCTGCAACTTCTTACTTCCCTAATGAAGTTAAGAACCTGAATATAAATT 
TGTTTTCTCAAATATTTGCTATGAAGGGTTGATGGATTAATTAAATAAGTCAATTC 
CTGGAAGTTGAGAGAGCAAATAAAGACCTGAGAACCTTCCAGA

Sequence ID 1071 SEQ ID NO: 383 

NGATATAGTNCCGCATGGGAAAGATGANCAGGTATAACCNAGCNTNATATAGCAAG 
GACTAACCCCCCTGCCTTCTGCATAATGAATTAACTAGAAATAACTTNGCAAGGAG 
A G C C AAAG C T AAG AC C C C N G A A AC CAGACGAGCTACCT A AG A AC AGN T A A A AG AG C 
ACACCCGTCTATGTAGCAAAATAGTGGGAAGATTTATAGGTAGAGGCGACAAACCT 
ACCGAGCCTGGTGATAGCTGGTTGTCCAAGATAGAATCTTAGTTCAACTTTAAATT 
NGCCCACAGAACCCTCT AAAT CCCCTTGT AAAT TT AAC TGTTAGTCCAAAGAGGAA 
CAGCTCTTTGGACACTAGGAAAAAACCTTGTAGAGAGAGTAAAAAATTTAACACCC
ATAGTAGGCCTAAAAGCAGCCACCAATTAAGAAAGCGTTCAAGCTCAACACCCACT 
ACCTAAAAAATCCCAAACATATAACTGAACTCCTNACACCCAATTGGACCAATCTA 
TCACCCTATAGAAGAACTAATGTTAGTATAAGTAACATGAAAACATTCTCCTCCGC 
ATAAGCCTGCN 

Sequence ID 1074 SEQ ID NO: 384 nt: 689

GGGAGGCGGAGGCTGCAGTGAGCTGAGATCGTGCCACTTCATTCCAGCCTGGGCAA 
CAAAG C GAAAC T C T G T C T C AAAAAAAAAAAAAAAAAAAAT T T G T T G AC T G T T G T AA 
TTTAAAGCTTGTCATTTTTTATTTAGTAATAACACTCATTAGTGTAGTATCTATGA 
TGAACCAGGTTCTGCACAAAGTACCTTATGTTCATGGCCTCATATCGTCTTCTCCA 
AAAC T C T GC AAG AT AGG AT T CAT C AC C AC T T AT AGGG AG AG AT C T G AAAG T T T AAA 
ATTGTACCCAAGGTCACACAGCTGGTAAGTGCCAGAGCTGGGATTCCGTAGGGTGT 
TCANAGTGCCTCTCCTGCCGTAGGCTTATCACAAAAAGTCAAAGTTTGGTCATAAT 
AAAGCCTGAAGTTTGGCAGGATTTAAAAATAGTCACCANACTTTTGAGTTGGAGCA 
TCCCACCTCACTGCTGTTCACCTTCTGTGGCAGGGAGAGTCATCATTTCCATTTCA 
GCTTGTGGAATATCTTGTCATTAACATTCTCATGCAAAAGCCATTTTATGGTGCCC 
AAT G AANAT GG T T AAGC T AC T GCC C C AAGCC TNT GGAAGCC T T C C T AAT T T T GGAC 
TTGCACTATGCAAATTGNATAATATTTTCTCTACCCTAAGCCAAATATTTTCTTCA 
CTTTTCATTCATTCTAC 

Sequence ID 1081 SEQ ID NO: 385 

CGCCGCCGCGCCGCCGTCGCTCTCCAACGCCAGCGCCGCCTCTCGCTCGCCGAGCT 
CCAGCCGAAGGAGAAGGGGGGTAAGTAAGGAGGTCTCTGTACCATGGCTCGTACAA 
AGCAGACTGCCCGCAAATCGACCGGTGGTAAAGCACCCAGGAAGCAACTGGCTACA 
AAAGCCGCTCGCAAGAGTGCGCCCTCTACTGGAGGGGTGAAGAAACCTCATCGTTA 
CAGGCCTGGTACTGTGGCGCTCCGTGAAATTAGACGTTATCAGAAGTCCACTGAAC 
TTCT GAT TCGC AAAC TTCCCTTCCAGCGTCTGGTGCGAGAAATTGCTCAGGACTTT 
AAAACAGATCTGCGCTTCCAGAGCGCANCTATCGGTGCTTTGCAGGAGGCAAGTGA 
GGCCTATCTGGTTGGCCTTTTTGAAGACACCAACCTGTGTGCTATCCATGCCAAAC 
GTGTAACAATTATGCCAAAAGACATCCAGCTAGCACGCCGCATACGTGGAGAACGT 
GCTTAAGAATCCACTATGATGGGAAACATTTCATTCTC 



Sequence ID 1083 SEQ ID NO: 386 nt: 198

GCGCGTCGACTTTGTTTAGACATTGAATGACTTTGTTAAAGGCACAATTAATCACA
TTGGTTGTACTCT GNNG AC AGC C T T C T T T AAAAAAAAAAT AAAC AAT T TAAAACAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
A A A A A A A A A A A A A A A A A A A A AN T T T TAACC 

Sequence ID 1084 SEQ ID NO: 387 nt: 198

GCGCGTCGACTTTGTTTAGACATTGAATGACTTTGTTAAAGGCACAATTAATCACA 
TTGGTTGTACTCT GNNG AC AGC C T T C T T T AAAAAAAAAAT AAAC AAT T TAAAACAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
A A A A A A A A A A A A A A A A A A A A AN T T T TAACC 

Sequence ID 1099 SEQ ID NO: 388 nt: 561

TGCATGCTTGTGGATTGGAAAAACTTTGGAGACTGATTACTTTTCATTATATATGT 
GTCACAGTGAAACAGCTTTTATGTGTCATGTAAGATTACTGCTTGCCTCTCTAAGG 
AAGGTCGTGACTGTTTAAATAGACGGGCAAGGTGGAACCTTTTGAAAGATGAGCTT 
TTGAATATAAGTTGTCTGCTAGATCATGGTTTGTATTGAACTAACAAGGTTTGCAG 
ATCTGCTGACTTATATAAAGCTTTTTGATTCCTACTAAGCTTTAAGATTTAAAAAA 
TGTTCAATGTTGAAATTTCTGTGGGGCTCTATTTTTGCTTTGGCTTTCTGGTGAGA 
GAG T GAG G AAG C AT TCTTTCCTT C AC T AAG TTTGTCTTTCTTGTCTTCTG GAT AG A 
TTGATTTTAAGAGACTAAGGGAATTTACAAACTAAAGATTTTAGTCATCTGGTGGA 
AAAGGAGACTTTAAGATTGTTTAGGGCTGGGCGGGGTGACTCACATCTGTAATCCC 
AGCACTTTGGGAGGCCAAGGCAGGCAGAACACTTGAAGGAGTTCAAGACCAGCGTG 
G 

Sequence ID 1109 SEQ ID NO: 389 

TTTGNCGGTNTTGGANNNNNANAANTTTCTTCCANNCNTNACNTNTTGGTGGNCTA 
AATTAANATGGNTTTNGNGGGTTCNTTNCTNNNTNNNNCATGGGANANAATTNATT 
NTCNTNCNNNTTCCTTNNCCCTNAANCTACCTTCCCCCNATTTTCTCCCCTNTTCN 
TNAATTANCATCCTCTCCNCNTANNTCNANACNTTAATGGCAANACTATCTAATAN 
CNANNATAANANCTCCTGTNNNCCACATNTCTTATTNNNCGCNNCANGTTNCANNC 
CCNCAGAGTNAACTCATCCTCNNCNNAANTTCATATCGTGNNCTNTNNNCNNTNGC 
GCGANATATTAANNANACCNGTANNTNNNANACANNANNTNNGNAANAANCCTTCT 
NANNTTTTAGCNTCNNGCNNTAACNNNNNTCTTNGTGNNNNCNCAGCTTTCNCNNC 
ATNATNCTNCNNCGAANTNTCANNCNTCTCCNCTTNAATGNNTTCCCATGNATTAA 
NTNCCTCGNNNANAGCACTATCGTNNNNGAGNNNATTATNGNCNNTTTACNTCATG
TGGTCCANTNNCGTTNGNCGCNNNNAATNTTCGTNNNNCNN 

Sequence ID 1118 SEQ ID NO: 390 

GGATTTTAGAGGAAGGCGCTNGGTTACATTGGAGAACTGGAGTGGTCTGGAGTTCC 
ACGGTGTAGTGGACCAGAGGCCACCTCTCCTGGGCTTCTCAGTGTCTCGCCGGCGG 
GGTTCGGCCTGAGCTGGATTGACATAGCCCTTGGCGGATTTAAACAACCTAAACAT 
TAAGCAGTACAGCTGCCTCAAACCTTTGGGATTTTCAGAATGACTGACACTGCCGA 
AGCTGTTCCAAAGTTTGAAGAGATGTTTGCTAGTAGATTCACAGAAAATGACAAGG 
AGTATCAGGAATACCTGAAACGCCCTCCTGAGTCTCCTCCAATTGTTGAGGAATGG 
AATAGCANAGCTGGTGGGAACCAAAGAAACAGAGGCAATCGGTTGCAAGACAACAG 
ACAGTTCAGAGGCAGGGACAACAGATGGGGGTGGCCAAGTGACAATCGATCCAATC 
AGTGGCATGGACGATCCTGGGGTAACAACTACCCGCAACACAGACAAGAACCTTAC 
TATCCCCAGCAATATGGACATTATGGTTACAACCAGCGGCCTCCTTACGGTTACTA 
CTGATAGAAATGTTGGCAGCTTTTAGTAAAAGCATTTACTCTGTTACCATGAGAAA 

Sequence ID 1125 SEQ ID NO: 391 

NGACTGGCTCCCGAAAAGAAGGGTGGCGAGAANAAAAAGGGCCGTTCTGCCATGGA 
CGAAGTGGTAACCCGCGAATACACCATCAACATTNACAAGCGCATCCATGGAGTGG 
GC T T C AAG AANC G T GC AC C T C GGGC AC T C AAAG AG AT T C GG AAAT T T GC C AT G AAG 
GAGATGGGAACTCCATATGTGCGCATTGACACCAGGCTCAACAAANCTGTCTGGGC 
CAAAGGAATAAGGAATGTGCCATACCGAATCCGTGTGCGGCTGTCCANAAAACGTA 
ATGAGGATGAAGATTCACCAAATAAGCTNTATACTTTGGTTACCTATGTACCTGTT 
ACCACTTTCAAAAATCTACAGACAGTCAATGTGGATGANAACNAATCGCTGATCGT 
CAGATCAAANAAANT 

Sequence ID 1139 SEQ ID NO: 392 nt: 503

CAGCACTGCCAGTGGAGATGGGCGTCACTACTGCTACCCTCATTTCACCTGCGCTG 
TGGACACTGAGAACATCCGCCGTGTGTTCAACGACTGCCGTGACATCATTCAGCGC 
ATGCACCTTCGTCAGTACGAGCTGCTCTAAGAAGGGAACCCCCAAATTTAATTAAA 
GCCTTAAGCACAATTAATTAAAAGTGAAACGTAATTGTACAAGCAGTTAATCACCC 
ACCATAGGGCATGATTAACAAAGCAACCTTTCCCTTCCCCCGAGTGATTTTGCGAA 
ACCCCCTTTTCCCTTCAGCTTGCTTAGATGTTCCAAATTTAGAAAGCTTAAGGCGG 
C C T AC AG AAAAAGG AAAAAAGGC C AC AAAAG TTCCCTCT C AC T T T C AG T AAAAAT A 
AAT AAAAC AGC AGCAGCAAAC AAAT AAAATG AAAT AAAAGAAAC AAAT G AAAT AAA 
TATTGTGTTGTGCAGCATTAAAAAAAATCAAAATAAAAATTAAATGTGAGCAAAG

Sequence ID 1148 SEQ ID NO: 393 nt: 587

TGAAAAATAAAGTTTTTATGTATATTCTACATATGTATATGTTGGTAGAAAGCAAA 
AACGCTAGGTAAAAATAAATGTAATACAATTTTAGCTATGAACCAAAAAACCATTT 
GTGGTGTGGATGCAAGAAAGTCTGGATGGGTGCAGAGTTCTCCATGTTTCACTTCT 
GACATTTGAAAATACGCAGTT T GCATTT GAT ACGTC AAAT GT TAT TTTTAAGAAAA 
CCAATAAAATCATTAAAACCGAAAAGGCAGTTTTGCTTGTTTTTACCTTAGTTGGA 
GTTATCTGCAATTGCCGTATTAGTGTTTTAAGGAACTTGTAAGTAAGCTCCTTAGT 
C C C C T T TAG AGC T AC G AAAC AT G T C AAT T T T AC T T T T C T C C AGC T T T T T GG AAT C T 
TATCTAAATTACCATGTAGAGTTCTGCATAGCTTCAAATTCTCTTAGCCAATGTGG 
TCTGTAAGTGTCTATCGATGAATTTCACCGTTAATTGCCGTAGTATACTGTCCTGT 
ACCGGATGTGAAGAGGAGCAACTCTGCACAGTGCACTGGTTGCTCCCATGGTAGGA 
ANGAATGGCTTATCAATGGTCGGATTT 

Sequence ID 1160 SEQ ID NO: 394 nt: 650

GGAGGATGGAGCAGTGAGCGGGTCTGGGCGGCTGCTGGCAGCGCCATGGAGACGGT 
AC AG C T G AG G AAC CCGCCGCGCCG G C AG C T G AAAA AG T T G GAT G AAG A T AG T T T AA 
C C AAAC AAC C AG AAG AAG T AT T T GAT G T C T T AG AG AAAC T T G G AG AAG G G T GAG T G 
TAAAGAAACTATAGGTAGGTCATTGGGTCCCAGTCTTTTTCCTGCCCCAGAAGAAG 
CAGAAGGATATGAACCTTTCAGCATTGTTCTAGGTGGGGTGGAAGGTAAATTTACA 
GCTTGTGATGTCCTTCTTCGCTTTACTCCAATCCCTATTATAGACAGATTTAGTGA 
TTCCTGGTCTTTTTAACACGAAGAATATCTATTGTTTTCTCTTTTGTAGGATCTGT 
AT GAT T T T AT C T AC T T AAC AGAT AGC AC T AAT T AGAT T AAAAT T C T AT AAGAAAC T 
T T T T AAT TTGCTGTT CAT AAT T T C T GAT T GG T AT GC AAT AAC T G T T T C AAT G AAAA 
T C AAT GT AAT T T AGT AT T T T AAT AT T T GC ACC T T T GT GAAAT AT AGT AAAT AAAT T 
AAGCACTATCACCACCTTCACAGCTACTTAGGAGATCCACAATCCTGGGTTGGGAG 
CCAGTGGATTTCCTGAAACACAGATTTGTTAATG 

Sequence ID 1165 SEQ ID NO: 395 nt: 502

CTCAAGTGAATCCTGGCTTCTTGGAAGCGCTTGCCTAGACGAGACACAGTGCATAA 
AAACAACTTTTGGGGGACAGGTATGTTTTCTTGCAGCTGCGGTTGTAAGGTCTTGG 
CAAGACAAGCAGTGTGGCCAGAATTTTGAACTTCTGATGAATGTGTAATGCAAAGG 
ACCTTGTACATTTTTTTGTTTCAAGGTCCTCAAAATGAGCACATGAAGAGGTTGCT
GTGAAACTTTAAGTGGCCCTACTGCGCAGAAGCATTCAGATGTCACTTGATGATCT 
GTAAGGGAACTTGCTGATTTGGGAATGTGCTTAGGGAACACACATTCCTTTTGACA 
GGGTCTGTCACTGGGTGGGTGATGAATTATACAGATGACATGTGCTTTTTTTTCTT 
T T T T C AAC C T C AAT GGT AT T C C T AC AGG AAAT GGAT AACC AT T T T AAC T GT AT T T T 
TTGCAGCCCGTACCTTCTTGGGAATACAATTGTCTAACTTTTTATTTTTGGTCT 

Sequence ID 1172 SEQ ID NO: 396 nt: 648

CCACAATAATAAGAGAAAAACAGGAGCAAAAGGATATACAAAACCACCAGAAAACA 
AAT AAC AAAGT GAC AGGAGT AAGT C C T T AAC T GGC AAT AAT AAC CAT G AAT C T AAA 
T GGAT TCCATTTCC C AC T T AAAAGAT AAAGAC AT GC T G AAT GGAT AAAAAGC T GT C 
ACCCAGTTATATGCTGCCTACAACAAACTCACTTCACCTGTAAACATACATATGGA 
TGGAAAGAGAAGGCATGGGAAAAGATACTCTACTCAAATGAAAACAAAAACCAAAC 
AAAGGTGGCTATTCTTATATGAGATAATACAGACATTAAATCAAAAACTGGAAACA 
AACACAAAGTCATTGTATAATGATGAATTCAATTATATCATGATGAATTCAATTAT 
ATCCTCCTTCCTGATCAATTCAGAAAGGAGGATATAATCTTTTTAAATATATATAC 
ACCCAACACCAGAGCATATAAATATGTAAAGGAAGATAAAGGGAGTCCTGTGATCA 
AGAATAAATATAACAATTATAAATATTTTATCTAAAGTGATAGATAGACTGTAATA 
CAATAATAGGGTGGTGACATTAACACCCCCTCTCACATTGGACTGATCATCTAGAA 
G G G AG AAAAAG CTTTATGATTG GAAAAG C C A T 

Sequence ID 1178 SEQ ID NO: 397 

ATTGTGTTGGCCACCCGGGAATTCGCGGCCGCGTCGACCTACGCACACGAGAACAT 
GCCTCTCGCAAAGGATCTCCTTCATCCCTCTCCAGAAGAGGAGAAGAGGAAACACA 
AGAAGAAACGCCTGGTGCAGAGCCCCAATTCCTACTTCATGGATGTGAAATGCCCA 
GGTGAGGAGACGGCTTGCTGTAGTGGGGAAAGCACTGGACCTCAACAGTTGGAAAA 
TGTTGTAGTGTTAGCTGTCTCGTATCCTTGAAGCTGTGCAGCAGCTTCAGTTTCTT 
CGCCTGTG G AAAAT A T T T T C C C T GAT AC T C T T AAAAT T T G AA T G T AT GAG AC T G G C 
AAAGT TTTGCATCTTAGGAGGAGTGATT CAT TTCACCGTGATCTCTCATCACATTT 
CACATACAACCCCTACGTTTTTTTGTGTTGGGAAACAATGTAATGGATGATGAGTT 
GGGCATAAGTGCAGGAAAGACGGGTGTAATAGAGGAAAAAAATGTTATCTGCTTTT 
CTTTCAGGATGCTATAAAATCACCACGGTCTTTAGCCATGCACAAACGGTAGTTTT 
GTGTGTTGGCTGCTCCACTGTCCTCTGCCAGCCTACAGGAGGAAAAGCAAGGCTTA 
C AGAAGGAT GT T CC T T C AGGAGGAAGC AGC AC T AAAAGC AC T C T G AGT C AAN AT G A 
G T GGGAAAC C AT C T C AAT AAAC AC AT T T T GGAT 






Sequence ID 1180 SEQ ID NO: 398 nt: 622

CTTTTCCTCCCGCTGTCCCCCACGGGAGGGGACTGCTCTCCCCCGCTGCATCCTTT 
C T G T G AGG T AC C T T AC C C AC C T C AGC AC C T G AG AGGG T G AAAT AG AAT T C T AAC C T 
CGACATTCGGGAAGTGTTTTTGAGAAGTCTCGGTCGGTAAGGGAAGTCTTCCAAGT 
CCGTGCAGCACTAACGTATTGGCACCTGCCTCCTCTTCGGCCACCCCCCAGATGAG 
GCAGCTGTGACTGTGTCAAGGGAAGCCACGACTCTGACCATAGTCTTCTCTCAGCT 
TCCACTGCCGTCTCCACAGGAAACCCAGAAGTTCTGTGAACAAGTCCATGCTGCCA 
TCAAGGCATTTATTGCAGTGTACTATTTGCTTCCAAAGGATCAGGCCCTGAGAACA 
ATGACCTTATTTCCTACAACAGTGTCTGGGTTGCGTGCCAGCAGATGCCTCAGATA 
C C AAGAGAT AAC AAAGC T GC AGC T C T T T T GAT GC T GAC C AAG AAT GT GGAT T T T G T 
G AAGG AT GC AC AT GAAG AAAT GG AGC AGGC T G T GG AAG AAT G T G AC C C T T AC T C T G 
GC C T C T T G AAT GAT AC T G AGG AG AAC AAC T C T G AC ANC C AC AAT CAT G AGG AT GAT 
GTGTTG 

Sequence ID 1181 SEQ ID NO: 399 nt: 155

C GCC AC T TAT C C AGT GAACC AC T AT C AC GAAAAAAAC T C T AC C T C T C TAT AC T AAT 
C T CCC T AC AAAT C T CC T T AAT TAT AAC AT T C AC AGCC AC AGAAC T AAT CAT AT T AA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

Sequence ID 1182 SEQ ID NO: 400 

CATTGTGTTGGCNCCCGGGAATTCGCGGCCGCGTCGACTTTTTGTGTTGTTTGGAG 
CAGAAATACTAAAGAAGATTCCGGGCCGAGTATCCACAGAAGTGGACGCAAGGCTC 
TCCTTTGATAAAGATGCGATGGTGGCCAGAGCCAGGCGGCTCATCGAGCTCTACAA 
GGAAGCTGGGATCAGCAAGGACCGAATTCTT AT AAAGC TGTCATCAACCTGGGAAG 
GAATTCAGGCTGGAAAGGAGCTCGAGGAGCAGCACGGCATCCACTGCAACATGACG 
TTACTCTTCTCCTTCGCCCAGGCTGTGGCCTGTGCCGAGGCGGGTGTGACCCTCAT 
CTCCCCATTTGTTGGGCGCATCCTTGATTGGCATGTGGCAAACACCGACAAGAAAT 
CCTATGAGCCCCTGGAAGACCCTGGGGTAAAGAGTGTCACTAAAATCTACAACTAC 
TACAAGAAGTTTAGCTACAAAACCATTGTCATGGGCGCCTCCTTCCGCAACACGGG 
CGAGATCAAAGCACTGGCCGGCTGTGACTTCCTCACCATCTCACCCAAGCTCCTGG 
GAGAGCTGCTGCAGGACAACGCCAAGCTGGTGCCTGTGCTCTCAGCCAAGGCGGCC 
C AAG C C AG T GAC C T G G AAAAAAT C C AC C T G GAT GAG AAG TCTTTCCGTTGGTTGCA 
CAACGAGGACCAGATGGCTGTGGAGAAG

Sequence ID 1183 SEQ ID NO: 401 nt: 479

CGTGGCAGCCATCTCCTTCTCGGCATCATGGCCGCCCTCAGACCCCTTGTGAAGCC 
C AAG AT C G T C AAAAAG AG AAC C AAG AAG T T C AT C C GGC AC C AG T C AG AC C GAT AT G 
TCAAAATTAAGCGTAACTGGCGGAAACCCAGAGGCATTGACAACAGGGTTCGTAGA 
AGATTCAAGGGCCAGATCTTGATGCCCAACATTGGTTATGGAAGCAACAAAAAAAC 
AAAGCACATGCTGCCCAGTGGCTTCCGGAAGTTCCTGGTCCACAACGTCAAGGAGC 
TGGAAGTGCTGCTGATGTGCAACAAATCTTACTGTGCCGAGATCGCTCACAATGTT 
TCCTCCAAGAACCGCAAAGCCATCGTGGAAAGAGCTGCCCAACTGGCCATCAGAGT 
CACCAACCCCAATGCCAGGCTGCGCAGTGAAGAAAATGAGTAGGCAGCTCATGTGC 
ACGTTTTCTGTTT AAA T AAA T G T A A A A A C T G 

Sequence ID 1185 SEQ ID NO: 402 nt: 628

CTTTGATTACCTTTGAGTATTAGGTTGAAAGCTTCTCTGTGCTTGATTGAACATTG 
TGATGATGTTGATTGGGTCATGTCAGATTTAGACAGTGTTGTGTTTAAGATAAATG 
TTTAATGGCTCTTAGCAGTGTTCATGCCTCCCCTTTTCCCCTGATACTTTAAAAAC 
AGAAT AT AC AG AAAAGGGGAGT T GGGT G AAGAAT C AC CAT AT T C T CAT T AC C AGAG 
TAGTGTCTACCAGCTGTTTTCACATTTTTCTGTTTCCTTCTGTCCTTGGAATCCTT 
TTTTTAGATCCTTGTAATACTAGTAAAGATATTCCACTCTGTGTTGTAAGCATTTT 
TCCATTTTGCTCCATGGTCTTCATAATGCCCTGTGGTCCTTTATTAAGGGGATGCA 
CCATGTAGAGGTGAAAGGCTTTCCTTGACTTGGCCACCATTTCTGTATTTTCCTTA 
GAGGAGGAGGTTTCCAACATTTCTTTTTTAGAGACAGAGTCTCGTTCTGACACGCA 
GGCAGGAGTGCAGTGGCATGATAACAGCTCACTGCAGCCTCGAACTCCTGGGCTCA 
AGTTATCCTCCCACCTCAGCTTCCTGAGTAGCTAGGACTGCAGGTGCCTGCCACCA 
CACCCAGCTAAT 

Sequence ID 1186 SEQ ID NO: 403 nt: 494

CAGCCCTCCGTCACCTCTTCACCGCACCCTCGGACTGCCCCAAGGCCCCCGCCGCC 
GCCTCCAGCGCCGCGCAGCCACCGCCGCCGCCGCCGCCTCTCCTTAGTCGCCGCCA 
TGACGACCGCGTCCACCTCGCAGGTGCGCCAGAACTACCACCAGGACTCAGAGGCC 
GCCATCAACCGCCAGATCAACCTGGAGCTCTACGCCTCCTACGTTTACCTGTCCAT 
G T C T T AC T AC T T T GACC GCGAT GAT GT GGC T T T GAAGAAC T T T GC C AAAT AC T T T C 
TTCACCAATCTCATGAGGAGAGGGGAACATGCTGAGAAACTGATGAAGCTGCAGAA 
CCAACGAGGGTGGCCGAATCTTCCTTCAGGATATCAAGAAACCAGACTGTGATGAC
TGGGAGAGCGGGCTGAATGCAATGGAGTGTGCATTACATTTGGAAAAAAATGTGAA 
T C AG T C AC T AC T GGAAC T GC AC AAAC T GGCC AC T G AC AAAAAT GAC 

Sequence ID 1188 SEQ ID NO: 404 nt: 599

GGGAGACAAGCCCAGCCTTTCGGCGAGNATACGTCTAACCCTGTGCAACAGCCACT 
ACATTACTTCAAACTGAGATCCTTCCTTTTGAGGGAGCAAGTCCTTCCCTTTCATT 
TTTTCCAGTCTTCCTCCCTGTGTATTCATTCTCATGATTATTATTTTAGTGGGGGC 
GGGGTGGGAAAGATTACTTTTTCTTTATGTGTTTGACGGGAAACAAAACTAGGTAA 
AATCTACAGTACACCACAAGGGTCACAATACTGTTGTGCGCACATCGCGGTAGGGC 
GTGGAAAGGGGCAGGCCANAGCTACCCGCAGAGTTCTCAGAATCATGCTGAGAGAG 
CTGGAGGCACCCATGCCATCTCAACCTCTTCCCCGCCCGTTTTACAAAGGGGGAGG 
CTAAAGCCCAGAGACAGCTTGATCAAAGGCACACAGCAAGTCAGGGTTGGAGCAGT 
AGCTGGAGGGACCTTGTCTCCCAGCTCAGGGCTCTTTCCTCCACACCATTCAGGTC 
TTTCTTTCCGAGGCCCCTGTCTCAGGGTGAGGTGCTTGAGTCTCCAACGGCAAGGG 
AACAAGTACTTCTTGATACCTGGGATACTGTGCCCAGAG 

Sequence ID 1189 SEQ ID NO: 405 

GGGAGACAAGCCCAGCCTTTCGGCGAGATACGTCTAACCCTGTGCAACAGCCACTA 
C AT T AC T T C AAAC T GAG AT CCTTCCTTTT GAG G G AG C AAG TCCTTCCCTTTCATTT 
TTTCCAGTCTTCCTCCCTGTGTATTCATTCTCATGATTATTATTTTAGTGGGGGCG 
GGGTGGGAAAGATTACTTTTTCTTTATGTGTTTGACGGGAAACAAAACTAGGTAAA 
ATCTACAGTACACCACAAGGGTCACAATACTGTTGTGCGCACATCGCGGTAGGGCG 
TGGAAAGGGGCAGGCCAGAGCTACCCGCAGAGTTCTCAGAATCATGCTGAGAGAGC 
TGGAGGCACCCATGCCATCTCAACCTCTTCCCCGCCCGTTTTACAAAGGGGGAGGC 
TAAAGCCCAGAGACAGCTTGATCAAAGGCACACAGCAAGTCAGGGTTGGAGCAGTA 
GCTGGAGGGACCTTGTCTCCCAGCTCAGGGCTCTTTCCTCCACACCATTCAGGTCT 
TTCTTTCCGAGGCCCCTGTCTCAGGGTGAGGTGCTTGAGTCTCCAACGGCAAGGGA 
ACAAGTACTTCTTGATACCTGGGATACTGTGCCCAGAGCCTCGAGGAGGT 

Sequence ID 1190 SEQ ID NO: 406 

GTTTAAATTTGACAAACTAAAGCTNATNACTGCTATAAGAGTAATAACTGCTCATT 
TTCCATAACTCATTCTTAAAGTTTTAGTAATGTAAAAGTTATTTTTTTGCAGTAAG 
TTATAATGATAGAAGCTTACATGTTTTTTCATGCCTCATCTGTTTCCCCTTAAAAC 
TATAATTATCAGTAAAGTCCTGTGGTATTTTTCAATTTGTAAGAAACTAGGCTATA 
TATACATTGGGAAAAACAGCCTTCATTTGTCAATGCACTAGTGTTCCAAAGGTTTC
T GG T AAT TGTGTGCTATTGCTTTTTGTT G AC T T GC AAAAAAAAAAAAAAAAAAAT T 
ACTATGACTTGNGGTAGCCCTGCAACCTTCGGAAGTGCTTAGCCCAGTCTGACCAT 
ACATTTATATTTANAATGCTTAGGTAAATAAATAATATGCCTAAACCCAATGCTAT 
AAGATACTATATAATATCTCATAATTTTAAAAATCACTGTTTTGTATAATAATAAA 
ACAAGGCAGGCAAGCTGTTCTACAATGACTGTTGGTAAGGGTGCTGAGGAAGAAAA 
ACAAACAATCTTGATTCAGGGATAGTGAATAGACAAAAAATGTCCTAATCAATGAA 
G C T G T G T GAT GAT T C T GAT T G AC AG AG A 

Sequence ID 1191 SEQ ID NO: 407 

G T GC AAAG TGTTATAT C C AC T T T C AAC AAAG AG AG AAGC T G AAAAGC T AAC C C AAT 
GTTAATTTTGGATCACACACATTCAGTGTAGACTTTAAGATTTTACTTCTGTTGGA 
GTAGCTATATTATTTCTAGTTAAAAAACTCTCTATATACATATTTATTTGTTTTTC 
TACTTGTTTAATATTTTTCTCTTCCAATTAGGAACTCAATATGGAATAAAAAATAT 
TTAAATGTATTTTACTCAAACGTGTGTGTATATATGTTTGTGTGCATGATAAGGAG 
AGTGAGAGCAAGAGTAAGAGAGAGAGAGCACGCATAGATGGAAGCACACATTTAAT 
GTCTATGAAATGAGAAAACATTAAGGCTAAGATATTTTTCCTTCTGAACTAGCAGA 
TTGTATCAATGGCTGGTCACTTAAATTAATCAGTTTGTAAAGATATTTAAAAGGTA 
TGTCTACCTTCTTGCAATTAATTTGATTATGTTCTAATGGCATGGCAAGAGAAATG 
A AAG AAG A T AAC T AAAAG T T AAAAG TCGTTGCATGTTTTTGTTG C AG CAT AC C C T T 
CTTTCAGGCTACCGAATAACCTTGATTGACATTGGATTAGTAGTAGAATACCTCAT 
TGGTAGAGCATATCGCAGCANCTACACTAGAAAACAT 

Sequence ID 1192 SEQ ID NO: 408 

GTCTGGAACTCCAGACCTCAGGTGATACCCCTGCCTCAGCCTCCCAATGTGCTGGG 
ATTACAGCTGTGAAGCCACCGCGCCCGGCTGCTGTGATAGTTGAGATGTAAACCAA 
AAAT AAAA T T C T AAG C C AC C C AAT C C G AC T G AAT G G AC CCTTCCTGTT G AG C AAG G 
ACATTCCAAAGTAAACTGAAAAGACCAGCTTAGGCCATGATGGGAAGGGGAGGTGT 
CAACATGCCTCATTCTACCTTCCTCCCTCTGGAATCCAGACACAACTGACCAGCAT 
TAACATTAAAACAGAGATCTTAAGCTGGGCACGGTGGCTCATGCCTGTAATCCCAG 
CACTTTGGGAGGCCAAGGTGGGATCACCTGAGGTCGGAAGTTCAAGACCAGCCTGG 
CCGGTATGGTGAAGCCATGTCTCTACTGAAAATGCAAAATTGGCCGGACATTGTGG 
TGCA 



Sequence ID 1193 SEQ ID NO: 409 

TNCNTTTTTTTTCCCNCGGGAAAGCGCGCCATTGTGTTGGTCCCCGGGAATTCGCG 
GCCGCGTCGACGAGAAATGGCTTGAACCCAGTAGGCAGAGGTTGTAGTGAGCCCAG
AATNGGNCACCTGCACNTTTANCCNTGGGTGACAAAANTGAAAACTTTGTCTNAAA 
A A A A A A A A A A A A A A A T T T T A AN T N AAAT N AAAAAN C C T T TNCNT TNT T T T TNAAAN 
NGGGGGGGGNNTTTTTNGGGNTTNGNNNTGGTAAAAANTNNNTTTTTTTTTTTTTA 
GGGGCCNANNCCCCNTTTTANAAAANCCNGNTTTTNAAAAAANTTTTTTNCCCNCN 
NTTNGGGGGGGGGGNTTTTNANCNNTNTTNGGGGGGGNNCCCCTNTTANNACCNNC 
AAANTTTTTANTTTTTTGNNNAANNNCCCCCTTTTTTNNTTTTTTTTGNGGGGGGG 
GGGNNGCCCCCNNCCTTTNGGGGGGGGGGNTTNNGNAAAANNACTTTTNAAAANNA 
AGGGNNGGGGGNANATNNCCCCCCCNGGNTTTTTTTTTTAAAAANTNAANNGGGGG 
GGGNNNCTNANTNGGGGCNCCCANNGGGGGNTTANAANNATTTTCTNCCCAAACCC 
CCNGNTTTTATNNCCCCCCCCCCCCNCNNNNGAANGGGNGGNCCNTTTTTTTTATT 
T T T NN G GN G G GN AAAAAAN T T T N AA AAANN ANN A T N TTTTTTCCCCCCCCCCCCNC 
T T T T N G GN AAAN C CNN G G G G G GN TCCTTTTT N AAANNNN C C C C C AAAAAAAAN T T T 
TTTTNTTNTNTTTTTCTCTNGGGGNCCNNANTTNTANANTTTTNCNCCNAAAAAAA 
ANGGGNCCCCTTTTTTTNCNGGNNGGNNCCCAAAANNTTTTTTTTNAAAAAAAAAA 
AAAA 

Sequence ID 1195 SEQ ID NO: 410 

G T T C G T G ACN T T C GG AGC T AC C T G AC AG AGC AG AG T C AAC C AGGN T C T GC C C AAAG 
AG AG T G T T AG G C C T GAG C T T GAG AG C C C T G GAG AG AC G T G T G C AC AAAAT G T G AC C 
TGAGGCCCTAGTCTAGCAAGAGGACATAGCACCCTCATCTGGGAATAGGGAAGGCA 
CCTTGCAGAAAATATGAGCAATTTGATATTAACTAACATCTTCAATGTGCCATAGA 
CCTTCCCACAAAGACTGTCCAATAATAAGAGATGCTTATCTATTTTA 

Sequence ID 1196 SEQ ID NO: 411 nt: 412

GTCGACGCGGCCGCGGTCGCTGGAGNCGATCAACTCTAGGCTCCAACTCGTTATGA 
AAAGTGGGAAGTACGTCCTGGGGTACAAGCAGACTCTGAAGATGATCAGACAAGGC 
AAAG CG AAAT T GGT C AT T C T C GC T AAC AAC T GCC C AGC T T T GAG G AAAT C T G AAAT 
AGAGTACTATGCTATGTTGGCTAAAACTGGTGTCCATCACTACAGTGGCAATAATA 
TTGAACTGGGCACAGCATGCGGAAAATACTACAGAGTGTGCACACTGGCTATCATT 
GATCCAGGTGACTCTGACATCATTAGAAGCATGCCAGAACAGACTGGTGAAAAGTA 
AACCTTTTCACCTACAAAATTTCACCTGCAAACCTTAAACCTGCAAAATTTTCCTT 
TAATAAAATTTGCTTGTTTT 



Sequence ID 1197 SEQ ID NO: 412

CCGCCAACATGGGCCGCGTTCGCACCAAAACCGTGAAGAAGGCGGCCCGGGTCATC
ATAGAAAAGTACTACACGCGCCTGGGCAACGACTTCCACACGAACAAGCGCGTGTG 
CGAGGAGATCGCCATTATCCCCAGCAAAAAGCTCCGCAACAAGATAGCAGGTTATG 
T C AC GC AT C T GAT G AAGC G AAT T C AG AG AGGC C C AG T AAG AGG T AT C T C C AT C AAG 
C T GC AGGAGGAGGAGAGAGAAAGGAGAG AC AAT T AT GT T C C T GAGGT C T C AGCC T T 
GGATCAGGAGATTATTGAAGTAGATCCTGACACTAAGGAAATGCTGAAGCTTTTGG 
ACTTCGGCAGTCTGTCCAACCTTCAGGTCACTCAGCCTACAGTTGGGATGAATTTC 
AAAACGCCTCGGGGACCTGTTTGAATTTTTTCTGTAGTGCTGTATTATTTTCAATA 
AATCTGGGACAA 

Sequence ID 1198 SEQ ID NO: 413 

CAGAGGTGGGAGGATTGCTTCAGTTCAAGAGTTTGAGACCAGCCTGGGTAACATGG 
CGAAACCCTGTCTTTACAAAAAATGCAAACCTTTGCCGCATGTGTTGGGGTGCGCC 
TGTAGTCCCAGCTTCTCGGGAGGCTGAGGTGGGGGGACCACCTGAGCCATGGAGGT 
TGAGGCTGCAGTGAGCCGTGATACCACCACTGTACTCTAGCCTGGGCCATAGAGTG 
AG AC AC C C T G C C T C AG AAAT A 

Sequence ID 1199 SEQ ID NO: 414 nt: 439

CCCATCCCCTCGACCGCTCGCGTCGCATTTGGCCGCCTCCCTACCGCTCCAAGCCC 
AGCCCTCAGCCATGGCATGCCCCCTGGATCAGGCCATTGGCCTCCTCGTGGCCATC 
TTCCACAAGTACTCCGGCAGGGAGGGTGACAAGCACACCCTGAGCAAGAAGGAGCT 
GAAGGAGCTGATCCAGAAGGAGCTCACCATTGGCTCGAAGCTGCAGGATGCTGAAA 
TTGCAAGGCTGATGGAAGACTTGGACCGGAACAAGGACCAGGAGGTGAACTTCCAG 
GAGTATGTCACCTTCCTGGGGGCCTTGGCTTTGATCTACAATGAAGCCCTCAAGGG 
CTGAAAATAAATAGGGAAGATGGAGACACCCTCTGGGGGTCCTCTCTGAGTCAAAT 
CCAGTGGTGGGTAATTGTACAATAAATTTTTTTTTGGTCAAATTTAA 

Sequence ID 1200 SEQ ID NO: 415 nt: 526

CTGGAGACGACGTG C AG A A A TGGCACCTC G A A AG G G G A AG G A A A AG A AG G A AG A AC 
AGGT CATC AGCC TCGGACCTCAGGTGGCTGAAGGAGAGAATGT AT TTGGTGTCTGC 
CATATCTTTGCATCCTTCAATGACACTTTTGTCCATGTCACTGATCTTTCTGGCAA 
GGAAACCATCTGCCGTGTGACTGGTGGGATGAAGGTAAAGGCAGACCGAGATGAAT 
CCTCACCATATGCTGCTATGTTGGCTGCCCAGGATGTGGCCCAGAGGTGCAAGGAG 
CTGGGTATCACCGCCCTACACATCAAACTCCGGGCCACAGGAGGAAATAGGACCAA 
GACCCCTGGACCTGGGGCCCAGTCGGCCCTCANAGCCCTTGCCCGCTCGGGTATGA
AGATCGGGCGGATTGAGGATGTCACCCCCATCCCCTCTGACAGCACTCGCAGGAAG 
GGGGGTCGCCGTGGTCGCCGTCTGTGAACAAGATTCCTCAAAATATTTTCTGTTAA 
TAAATTGCCTTCATGTAAACTG 

Sequence ID 1201 SEQ ID NO: 416 nt: 613

CTTAAGTATGCCCTGACAGGAGNATGAAGTAAAGAAGATTTGCATGCAGCGGTTCA 
TTAAAATCGATGGCAAGGTCCGAACTGATATAACCTACCCTGCTGGATTCATGGAT 
GTCATCAGCATTGACAAGACGGGAGAGAATTTCCGTCTGATCTATGACACCAAGGG 
TCGCTTTGCTGTACATCGTATTACACCTGAGGAGGCCAAGTACAAGTTGTGCAAAG 
T GAG AAAG AT C T T T G T GGGC AC AAAAGG AAT C CC T CAT C T GG T GAC T CAT GAT GC C 
CGCACCATCCGCTACCCCGATCCCCTCATCAAGGTGAATGATACCATTCAGATTGA 
TTTAGAGACTGGCAAGATTACTGATTTCATCAAGTTCGACACTGGTAACCTGTGTA 
TGGTGACTGGAGGTGCTAACCTAGGAAGAATTGGTGTGATCACCAACAGAGAGAGG 
CACCCTGGATCTTTTGACGTGGTTCACGTGAAAGATGCCAATGGCAACAGCTTTGC 
CACTCGACTTTCCAACATTTTTGTTATTGGCAAGGGCAACAAACCATGGATTTCTC 
TTCCCCGAGGAAAGGGTATCCGCCTCACCATTGCTGAAGAGAGAGACAAAAGA 

Sequence ID 1202 SEQ ID NO: 417 

G G AAT TCGCGGCCGCGTC GAC CTCTGCTC G AA T T G AC AG AAAAG G AT T C T G T G AAG 
AGTGATGAGATTTCCATCCATGCTGACTTTGAGAATACATGTTCCCGAATTGTGGT 
CCCCAAAGCTGCCATTGTGGCCCGCCACACTTACCTTGCCAATGGCCAGACCAAGG 
TGCTGACTCAGAAGTTGTCATCAGTCAGAGGCAATCATATTATCTCAGGGACATGC 
GCATCATGGCGTGGCAAGAGCCTTCGGGTTCAGAAGATCAGGCCTTCTATCCTGGG 
C T GCAACAT CC T T CGAGTTGAATATTCCT TACT GAT C TAT GTTAGCGTTCCTGGAT 
CCAAGAAGGTCATCCTTGACCTGCCCCTGGTAATTGGCAGCAGATCAGGTCTAAGC 
AGCAGAACATCCAGCATGGCCAGCCGAACCAGCTCTGAGATGAGTTGGGTAGATCT 
GAACATCCCTGATACCCCAGAAGCTCCTCCCTGCTATATGGATGTCATTCCTGAAG 
ATCACCGATTGGAGAGCCCAACCACTCCTCTGCTAGATGACATGGATGGCTCTCAA 
GACAGCCCTATCTTTATGTATGCCCCTGAGTTCAAGTTCATGCCACCACCGACTTA 
TACTGAGGTGGATCCCTGCATCCTCAACAACAATGTGCAGTGAGCAT 



Sequence ID 1203 SEQ ID NO: 418 nt: 692

TGCAGAGGGGTCCATACGGCGTTGTTCTGGATTCCCGTCGTAACTTAAAGGGAAAC 
TTTCACAATGTCCGGAGCCCTTGATGTCCTGCAAATGAAGGAGGAGGATGTCCTTA
AGTTCCTTGCAGCAGGAACCCACTTAGGTGGCACCAATCTTGACTTCCAGATGGAA 
C AGT AC AT C T AT AAAAGG AAAAGT GAT GGC AT C TAT AT C AT AAAT C T C AAGAGGAC 
CTGGGAGAAGCTTCTGCTGGCAGCTCGTGCAATTGTTGCCATTGAAAACCCTGCTG 
ATGTCAGTGTTATATCCTCCAGGAATACTGGCCAGAGGGCTGTGCTGAAGTTTGCT 
GCTGCCACTGGAGCCACTCCAATTGCTGGCCGCTTCACTCCTGGAACCTTCACTAA 
CCAGATCCAGGCAGCCTTCCGGGAGCCACGGCTTCTTGTGGTTACTGACCCCAGGG 
CTGACCACCAGCCTCTCACGGAGGCATCTTATGTTAACCTACCTACCATTGCGCTG 
TGTAACACAGATTCTCCTCTGCGCTATGTGGACATTGCCATCCCATGCAACAACAA 
GGGAGCTCACTCAGTGGGTTTAATGTGGTGGATGCTGGCTCGGGAAGTTCTGCGCA 
TGCGTGGCACCATTTCCCGTGAACACCCATGGGAGGTCATGCCTGATCTGTACTTC 
T AC AG AG AT C C T G AAG AG AT 

Sequence ID 1204 SEQ ID NO: 419

TTTTTTTTTTTTTCCTGCGGGAAAGCGCGCCATTGTGTTGGTACCCGGGAAATTCG 
CGGCCGCGTCGACACAGGCCCCAGCATCAAGATCTGGGATTTAGAGAGGAAAGATC 
ATTGTAGATGAACTGAAGCAAGAAGTTATCAGTACCAGCAGCAAGGCAGAACCACC 
CCAGTGCACCTCCCTGGCCTGGTCTGCTGATGACACAGGTTGGGCNGGNNCNCNGG 
GGNGGNNNNGNNNNGCNGNNGGNNCNGNNNNCNNNNNGCNNNNGNNNNTNNNCNNN 
GNNCNNNNNNNNNNNNNNNNNGNTCNNGNNGCNGGGGCCNGGNCGNCGCGGNCGCG 
NNTNNNNGGGTNCNNNCNCNNNGGCGCGC 

Sequence ID 1205 SEQ ID NO: 420 

CAGACTCTGACCCAGCCTCAGTCCTAACTCCTGGGGCTGGGCTGAGGGGAACAAGC 
AT T T GC T GAAAC T T G AAAAAAC AAAGC AAAT C A A A A AC AG G A A A A A A T T G T AC C T G 
GTACTTTTTTT T AG AAAAAAAG AT T A A A A A AG A A AG A A T AAA T T C T T G T T T GG AAA 
C T T G AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAAAAAT T T TAAAC T C 

TNNNNNTNNCNNCNANTAANNCANNTCNANNNNANNNAATTACTTNNANGTNNNTC 
ACN 

Sequence ID 3r^S SEQ ID NO: 421 nt: 642

ACGAGAAGCCAGATACTAAAGAGAAGAANCCCGAAGCCAAGAAGGTTGATGCTGGT 
GGC AAG G T G A A A A AG G G T A AC C T C A A AG C T A A A AAG C C C A AG A AG G G G A AG C C C C A 
TTGCAGCCGCAACCCTGTCCTTGTCAGAGGAATTGGCAGGTATTCCCGATCTGCCA 
TGTATTCCANAAAGGCCATGTACAAGAGGAAGTACTCAGCCGCTAAATCCAAGGTT
GAAAAGAAAAAGAAGGAGAAGGTTCTCGCAACTGTTACAAAACCAGTTGGTGGTGA 
CAAGAACGGCGGTACCCGGGTGGTTAAACTTCGCAAAATGCCTAGATATTATCCTA 
C T G A AG A T G T G C C T C G A A AG CTGTTGAGCCACGGC A A A A A AC CCTTCAGTCAGCAC 
GTGAGAAAACTGCGAGCCAGCATTACCCCCGGGACCATTCTGATCATCCTCACTGG 
ACGCCACAGGGGCAAGAGGGTGGTTTTCCTGAAGCAGCTGGCTAGTGGCTTATTAC 
TTGTGACTGGACCTCTGGTCCTCAATCGAGTTCCTCTACGAAGAACACACCAGAAA 
TTTGTCATTGCCACTTCAACCAAAATCGATATCAGCAATGTAAAAATCCCAAAACA 
TCTTACTGATGCTTACTTCAAAAAGA 

Sequence ID 1208 SEQ ID NO: 422 

CCCTATACCTTCTGCATAATGAATTANCTAGAAATAACTTTGCAAGGGAGAGCCAA 
AGCTAAGACCCCCGAAACCAGACGAGCTACCTAAGAACAGCTAAAAGAGCACACCC 
GTCTATGTAGCAAAATAGTGGGAAGATTTATAGGTAGAGGCGACAAACCTACCGAG 
CCTGGTGATAGCTGGTTGTCCAAGATAGAATCTTAGTTCAACTTTAAATTTGCCCA 
CAGAACCCTCTAAATCCCCTTGTAAATTTAACTGTTAGTCCAAAGAGGAACAGCTC 
TTTGGACACTAGGAAAAAACCTTGTAGAGAGAGTAAAAAATTTAACACCCATAGTA 
GGCCTAAAAGCAGCCACCAATTAAGAAAGCGTTCAAGCTCAACACCCACTACCTAA 
AAAAT C C C AAAC AT AT AAC T G AAC T C C T C AC AC C C AAT T GG AC C AAT C T AT C AC C C 
T AT AGAAGAAC T AAT GT T AGT AT AAGT AAC AT GAAAAC AT T C T CC T CCGC AT AAG 

Sequence ID 1209 SEQ ID NO: 423 nt: 620

CTCTCCTGTCAACAGCGGCCAGCCTCCCAACTACGAGAATGCTCAAGGAGGAGCAG 
GAAGTGGCTATGCTGGGGGCGCCCCACAACCCTGCTCCCCCGACGTCCACCGTGAT 
CCACATCCGCAGCGAGACCTCCGTGCCCGACCATGTCGTCTGGTCCCTGTTCAACA 
CCCTCTTCATGAACACCTGCTGCCTGGGCTTCATAGCATTCGCCTACTCCGTGAAG 
TCTAGGGACAGGAAGATGGTTGGCGACGTGACCGGGGCCCAGGCCTATGCCTCCAC 
CGCCAAGTGCCTGAACATCTGGGCCCTGATTTTGGGCATCTTCATGACCATTCTGC 
TCGTCATCATCCCAGTGTTGGTCGTCCAGGCCCAGCGATAGATCAGGAGGCATCAT 
TGAGGCCAGGAGCTCTGCCCGTGACCTGTATCCCACGTACTCTATCTTCCATTCCT 
CGCCCTGCCCCCAGAGGCCAGGAGCTCTGCCCTTGACCTGTATTCCACTTACTCCA 
CCTTCCATTCCTCGCCCTGTCCCCACAGCCGAGTCCTGCATCAGCCCTTTATCCTC 
ACACGCTTTTCTACAATGGCATTCAATAAAGTGTATATGTTTCTGGTGCTGCTGTG 
ACTT

Sequence ID 1210 SEQ ID NO: 424

TTCGTAATTAGAATACTGTTTGGACTTGCTCAACAAGCACCTTATCTTAACAAAAA 
G T AAC T T AT AG AAAAGGG AGAC AT T CAT T T AAC T T C AAGCC C AT AT T AT T C T T AAA 
AGCTGACTCTTGAAATAGTATTTATTGAGTCATAGTGGAGTCATGGGACTTTTTAA 
GGGCCGGAAGGGACTATTTAGATCATCCAGTCCCACCCTGTCATTTTATGGAGGAG 
GAAACTGAGGCCTAGATAAGATAACCAGTTAGTGGGTCCACTGACCTTTAGGACAG 
TAGTCTATCCGTAAGAGACAACATGGAGAAAGAAATACAACGTTTTTATAGTGAAT 
TATCATCTTACAAAGAATATTCTTCCCATATCGCACTTTTAAAAAGTGGGTACCTT 
AGTCAAATAGGAGAAAAAACCACTTGAGTAGTTTCATCCTCAGGTTTTAGGTGAGG 
AAACTGATACTCAGATTAAATAACTTTAAGCACACAGAGCCTGAATGATAGTCTTA 
TTTGAGCTCATCTGTGCTTTTAATGTGTACTACGTTAGGTGTTTTCACTTGCATTT 
CCTTTAGTCTTATTTGAGCTCATCTGTGCTTTTAATGTGTACTACGTTAGGTGTTT 
TCACTTGCATTTCCTTGTTTGACGTTGACAATAAATCGTGAAGCTGCCTTATCTAA 
GGAAGTCCTAAAGTAAATCATTGGAACACA 

Sequence ID 1211 SEQ ID NO: 425 

CCATTGTGTTGGNACCCGGGAATTCGCGGCCGCGTCGACGGAGTTTTACCTTATTA 
CACTTTAATCTCTGGATTTACCCCATCTCATTTCTCTTTTAGGAAAACTGTTTGTA 
TGTGGTGGCTTTGATGGTTCTCATGCCATCAGTTGTGTGGAAATGTATGATCCAAC 
TAG AAA T GAAT GGAAGAT GAT GGGAAAT AT GAC T T C AC C AAGGAGC AAT GC T GGG A 
TTGCAACTGTAGGGAACACCATTTATGCAGTGGGAGGATTCGATGGCAATGAATTT 
CTGAATACGGTGGAAGTCTATAACCTTGAGTCAAATGAATGGAGCCCCTATACAAA 
GATTTTCCAGTTTTAACAAATTTAAGACCCTCTCAAACTAACAGGCTTAGTGATGT 
AATTATGGTTAGCAGAGGTACACTTGTGAATAAAGAGGGTGGGTGGGTATAGATGT 
TGCTAACAGCAACACAAAGCTTTTGCATATTGCATACTATTAAACATGCTGTACAT 
ACTTTTTGGGTTTATTTGGAAAGGAATGCAAAGATGAAGGTCTGTTTTGTGTACTT 
T T AAGAC TTTGGTTATTT T AC T T T T T GG AAAAGAAT AAACC AAGAAT T GAT T GGGC 
AC AT CAT T T CAAGAAG 



Sequence ID 1212 SEQ ID NO: 426 nt: 374

AGAGCAGCAGCCATGGCCCTACGCTACCCTATGGCCGTGGGCCTCAACAAGGGCCA 
CAAAGTGACCAAGAACGTGAGCAAGCCCAGGCACAGCCGACGCCGCGGGCGTCTGA 
CCAAACACACCAAGTTCGTGCGGGACATGATTCGGGAGGTGTGTGGCTTTGCCCCG 
TACGAGCGGCGCGCCATGGAGTTACTGAAGGTCTCCAAGGACAAACGGGCCCTCAA 
ATTTATCAAGAAAAGGGTGGGGACGCACATCCGCGCCAAGAGGAAGCGGGAGGAGC
TGAGCAACGTACTGGCCGCCATGAGGAAAGCTGCTGCCAAGAAAGACTGAGCCCCT 
CCCCTGCCCTCTCCCT G AAAT AAAG AAC AG C T T G AC AG 

Sequence ID 1213 SEQ ID NO: 427 nt: 567

GAATTATTGACTTTGAATTGCATTTCAGTACCATGAAGTCAAAGTCAGTGGTGTAT 
TTGCTCATTTGTTCATTCTTTCTTTTCCACCAACATTACTGCCTGCAGAGCCAGAG 
GTGAGTGCAGAAATCCTGTCAATTCGTCACTTGTGGACAACCTGCAGCTTGCCACA 
GCCTACAGTTCCACCACTGT G AC C T C T GAAAAC C T C C T G AAC AAAAG GAAG GAG AC 
TTGGAAATCCTGAATGGGCTTGGAGACATTAAGGGAGAACTGCCTCCCTGGACCAA 
GGC AGAAT T C AAT AGAAC C AGC AAGAAAT T T T C C T AT G AAT GGGAAAGC AGG T GGC 
AGGGGGC AGGGGT GGAAAAGC T T T G T AC AGG AAT T GT GGAAAAGC T T T T GC AT TAT 
CTCTAGTCTGAAAGTCACATTTCTCAGTTCCTTTCCACTCTCTTCTGTCAACTTGC 
TGTGAGTAAATGACATCTGTCACCTGTGACACGGGCCAGGGACTATCACCATATGG 
CCCCCACACATTATCTAGTACCAGCCTGCCTGGGCCATGCCTTTTCCAGTCACTGT 
ACCAGCC 

Sequence ID 1214 SEQ ID NO: 428 nt: 620

CTCTCCTGTCAACAGCGGCCAGCCTCCCAACTACGAGAATGCTCAAGGAGGAGCAG 
GAAGTGGCTATGCTGGGGGCGCCCCACAACCCTGCTCCCCCGACGTCCACCGTGAT 
CCACATCCGCAGCGAGACCTCCGTGCCCGACCATGTCGTCTGGTCCCTGTTCAACA 
CCCTCTTCATGAACACCTGCTGCCTGGGCTTCATAGCATTCGCCTACTCCGTGAAG
TCTAGGGACAGGAAGATGGTTGGCGACGTGACCGGGGCCCAGGCCTATGCCTCCAC 
CGCCAAGTGCCTGAACATCTGGGCCCTGATTTTGGGCATCTTCATGACCATTCTGC 
TCGTCATCATCCCAGTGTTGGTCGTCCAGGCCCAGCGATAGATCAGGAGGCATCAT 
TGAGGCCAGGAGCTCTGCCCGTGACCTGTATCCCACGTACTCTATCTTCCATTCCT 
CGCCCTGCCCCCAGAGGCCAGGAGCTCTGCCCTTGACCTGTATTCCACTTACTCCA 
CCTTCCATTCCTCGCCCTGTCCCCACAGCCGAGTCCTGCATCAGCCCTTTATCCTC 
ACACGCTTTTCTACAATGGCATTCAATAAAGTGTATATGTTTCTGGTGCTGCTGTG 
ACTT 

Sequence ID 1215 SEQ ID NO: 429 

CACAAGATAGAATGGTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATTTTAA
GTGACAGTGCCATAGTTTGGACAGTACCTTTCAATGATTAATTTTAATAGCCTGTG 
AGTCCAAGTAAATGATCACTTTATTTGCTAGGGAGGGAAGTCCTAGGGTGGTTTCA
GTTTCTCCCAGACATACCTAAATTTTTACATCAATCCTTTTAAAGAAAATCTGTAT 
TTCAAAGAATCTTTCTCTGCAGTAAATCTCGCAGGGGAATTTGCACTATTACACTT 
GAAAGTTGTTATTGTTAACCTTTTCGGCAGCTTTTAATAGGAAAGTTAAACGTTTT 
AAACATGGTAGTACTGGAAATTTTACAAGACTTTTACCTAGCACTTAAATATGTAT
AAATGTACATAAAGACAAACTAGTAAGCATGACCTGGGGAAATGGTCAGACCTTGT 
ATTGTGTTTTTGGCCTTGAAAGTAGCAAGTGACCAGAATCTGCCATGGCAACAGGC 
TTTAAAAAAGACCCTTAAAAAGACACTGTCTCAACTGTGGTGTTAGCACCAGCCAG 
CTCTCTGTACATTTGCTAGCTTGTAGTTTTCTAAGACTGAGTAAACTTCTTATTTT 
TAGAAAGTGGAGGTCTGGTTTGTAACTTTCCTTGTACTTAATTGGGTAAAAGT 

Sequence ID 1216 SEQ ID NO: 430 nt: 484

CAACCTTAGCCAAACCATTTACCCAAATAAAGTATAGGCGATAGAAATTGAAACCT 
GGCGCAATAGATATAGTACCGCAAGGGAAAGATGAAAAATTATAACCAAGCATAAT 
ATAGCAAGGACTAACCCCTATACCTTCTGCATAATGAATTAACTAGAAATAACTTT 
GCAAGGAGAGCCAAAGCTAAGACCCCCGAAACCAGACGAGCTACCTAAGAACAGCT 
AAAAGAGCACACCCGTCTATGTAGCAAAATAGTGGGAAGATTTATAGGTAGAGGCG 
ACAAACCTACCGAGCCTGGTGATAGCTGGTTGTCCAAGATAGAATCTTAGTTCAAC
TTTAAATTTGCCCACAGAACCCTCTAAATCCCCTTGTAAATTTAACTGTTAGTCCA 
AAGAGGAACAGCTCTTTGGACACTAGGAAAAAACCTTGTAGAGAGAGTAAAAAATT
TAACACCCATAGTAGGCCTAAAAGCAGCCACCAATT 

Sequence ID 1217 SEQ ID NO: 431 

GACAGGCGGGGGCCCAGCGGCCGGGTGAAGGCCGGGTGGCTCTGTGAATCAAAGGA 
GAGTCCCAGAAAACCTGTGACTGTTGAAGAAAATTCATCTGTGAATTTTTATATTC 
AAGGAGTCAGTATTTATATTCATCTTTTAAACTGGGAAGATTTATATTTTACTTTA
AAACTTCTTGATAATAATTTACAATGAATGGACACAGTGATGAAGAAAGTGTTAGA
AACAGTAGTGGAGAATCAAGGTAAGTAAGCACTTTGTTATCAATTGTTTACTATGA
AGAGAGTTGAAAACTTGACTTTTTTCTTTATTGTTATTGTTGTTATTTAGTTTTCC 
TCATAGGTAGCAGAGTTTTCAGGTTTTCCTCTTAGCTATCCAAATACTAAAAAAAT
TCTGATATACGAACCTTTTTTCATAATACAGGTTTTAATTATATTTTTCATTCAGA 
TACACAGTAGATCTTAAATATAGAAAGTTTTTGTTTACTTAAATCTATTTGGAAGT 
TTATATTTGAGCTAATAATTAAGCTGGAGCATGTATAATAGATTTAAATTGTTTTG 
ACTGTTAGTGAAATTT 

Sequence ID 1218 SEQ ID NO: 432

CTCACTTGGTGGGTGAGCCTCCAATGACTACACCCAAGGAGGATTTAACACAGGGA 
TTTTATGACTTGCAACAAGTCAGGAGGACATGGGGTTGGGGTAGTTCAGCAGTGCC 
TGTCTGAACAAAGGTGAAAATTGGGCTTTTATTGGGCTGATCAAGGGGGAGTAAAG 
GCAGCCAGGAGCAGTCGCCTGTCATGCTTCTACCTATATTGCATGTATAGAAAAGG 
GAAAATAAACTCCTTCCTGGGCAGGGTTTTAGTATGCTAAGGAGGGGAGTTATTCA 
ACTTCAATCCAACTCAAGCATCAGCATTGCTGCGTCCATCCCAGTTTTGTTTTGCT 
GGGGCTGAACTTCTTCCTATAACTTTTTGAAACAACAAGAACTCAAGGTGTGACAG 
TTACAAGTGGGCCCTTTTTCACAGTGTGTACCTAAACACGTGAGGACCCTGGATTA 
CAGAATGACAGACTCGAAGTGACTCAAGTTCCGGTTGTTCATCTTTAGATGGTAAA 
GATGGCTGTACGTACTATCCTTGCTTATTTCCAATCTATTGTTTAAACTCTTGTAT 
ATGTAATACCGCAGAGGCTAGAGATACAACCTTTGACCAAATGAGTGAATTCAAGT
AATCCATTACTAATGTGATCTGGAAACAAACATGGTGTTGAATGTGCATATGT

Sequence ID 1219 SEQ ID NO: 433 nt: 559

CTTGGCAGCTCCGTTATGTGCCCAGCTCTTTGCAAGGGCATACTGGGAAATGAGTG 
GAGATAAAGGACCCAATCATAAGCATTTTACAGTATGGATACCCCATTTTAAAAAG 
GT AAAC T GAGGC AC AAT GC AAT TTTTTTTTTTTTT T AAGGAGT T T AT T T GAGC AAA 
CAGTGATTCATGAATCAGGCAGCACCAAACCAGAAGGAGGCTTTGCTGAANAAGGA
TGAGGGACAAGCATTTATAAAGTGAATGTAGATGTAATACAAAGAAAATATTTGAA 
CCGGGTGCGGTGGCTTACACTTGTAATCCCAACACTTTGGGAGGCCAAGGCGGGCA 
GATCACAAGATCAAGAGATCGAGACCATCCTGGTCAACATGGTGAAACCCCATCTN 
TACTAAAAAATACAAAAATTANCTGGGCGTGGTGGTGCGTGCCTGTAGTCCCAGCT 
ACTTGGGCGGCTGAGGCAGGANAATTGCTTGAACCCGGGAGGTGGAGGTTGCAGTA 
AGCCGAGATTGCACCATTGCACTACTCCAGCCTGGTGACAGAGAGAGACTCCATC 

Sequence ID 1220 SEQ ID NO: 434 

GANNNGTGCGATANNATGNNTGTCTTTTTTTTAAAGTNTTTCNNATNGNAGNGAAN
CCCCCNNANNTNNCATAANGAGAGATNACTACNGTACANATAGNGNCANACNGATA 
GTAGTANCAANATTGTNTTAGCTANATNANTCAATAGATATCNAGATANAANAANA 
NCNNGGATATACAGCGATGTNTNANNGGNNNNNNNANGGAACGAACATCNACNTTA 
ANNATAAGCTNGNGGAGAGAGACANGTANGTTATANANNAGAATNGNAGTAGGNGT 
GATCATAATAGNNNNNANNTANTATATANGATNTTANTGNNCTNTNNTNNGTTTAT 
CNNNAATNTCTATNCTNGAGAGNAGCNNNATNNNNAGGCGANGANATTGGGNNNTN 
CTCNTNATAGANANCTGGTGTCNNANAANTACNTCATCTATTNANCTCTCACNANA 
TGGNANNATANAGNAGNGNNNTNNANAGGANTANGCATAGNGNNTNNCTNAAACAA
AANNNATAAGANNTCTCGNNAANANGGGCCTNTNNTNTAGCGAGGNNTTANTTTNT 
ATANTTNTTCNCTCTTNNAATANNTANGATANATGANCTNGNNGTGATANATANNN 
NNTACNGTNAANNTNTANTCNTATAATAGATANAAATATAGGATNTTNCTCTGGCN 
GGTNGAANANTTNNTNCNNTTTNAATAATGNTGTTAGNGACNGNGNTNTNANANNN 
NNTTAGAAAGGTACTCTATATACTNNTATGNTNCGGCNNATAATANAACAGATGTT 
TGTATNAATATNAAANAAGGTCNNTTTCGNCAAGAGAANNNTGNCTGGTNATAGAA 
TTAGCATAANTTANNTANTATGATNNANTNNTNCTACNANTNTTAGCNNTTNGCAG 
NAGTCATTNNGNATNTATNNNGNNTANTAGTNANTTGGGNCTNNTNCAGANTATAT 
TNTGNGAANATGAANNTACGNANTCCTNNGNANTATNATNNTGANTANGANAANCN 
ANANNTNTTNTANNANTGNCTATANATTGCCNNGATANATTNTNNNAATGAANCGA 
TAGCCCGCNCTAAGGANNTNNGTNANNTAAANNTCTCAGATAANNTACNTNTTNNT
TATTAANCNANNATCACANTATANCNGNGACANNNGCGANANTATATGTATGNNAN 
TATNACNGNTCCNNNCCGNGAANNTANTCNTANNAGGCATTCNGNNGAGCTNTTCT 
NCTAGACNATTTNNANTGAAANNATGCNGNNAAAAACGACNNNCTTNAANTTNTGT 
CTACANTCCGCNNTNTTTNTACAGATNGCAGNTAAGNNNANTNANNGCTCTCANCT 
NGCTNNNACT 

Sequence ID 1221 SEQ ID NO: 435 nt: 741

AAGCAGAANTNTCTCTAAAAACATTATCTCCTTAAAATCTTGAGGTGCATATNAGA 
GCCACAGGCAATCTCTGACATATAAAATTGCAGTACAGGCCTTTCAAATTTGGCAT 
TTCACTGGTACAATACAACAACCAAGATATATAATAACTGTACAGTGCCTAGACAT 
TCCAGTAAGAACCATTATTTTCTTTAATGTAGAATGATTAATACATATTCTACAAG
GGGCAGTAAGGTTAGTAATTCTATAGGGTATGTCCCGACATAATTTTCAAATTGTA 
CAATAACACAAACAACTTTGTTAAGGCCATGTTTTATTTGCTGATTAATGGACAAA 
AGGCAATGTAATTTATTTTCAAGTATTTTCTTGAAAGTCTGTGCTCATAAAAATCA 
TGAAAAGTTGGAAAGACTGTTAAATCACTGAAACTTCAAATATATCTTACACAATC
TTGTTTGTACAAAAATACAAGTTAAATATAAACATAAAGCAATCATGGTAATTTTA
TGCAAATCTGTTTTATGTGATCATCAGTTATATATAAAAGTTTCTCAGTTCTGTTA 
TTTGTGAAAAGATCAATACCAGATTGAATGACTACCTATTGGCAAAGGGCCCTAAA 
AAGCTTACTTTAGCACTCATCTTTTACATGGTTAAATGCATTTCCTAATTTGAGAT 
CACCTAAACACTGGAAAAGAAAAAAAATGAAAGGGCAGTATGTCCATAAACCAACA
AATAATTTGGCTG 

nt: 485

CGAAATTTCCTTGTGACACAGAGGAAGGGCAAAGGTCTGAGCCCAGAGTTGACGGA 
GGGAGTATTTCAGGGTTCACTTCAGGGGCTCCCAAAGCGACAAGATCGTTAGGGAG 
AGAGGCCCAGGGTGGGGACTGGGAATTTAAGGAGAGCTGGGAACGGATCCCTTAGG 
TTCAGGAAGCTTCTGTGCAAGCTGCGAGGATGGCTTGGGCCGAAGGGTTGCTCTGC 
CCGCCGCGCTAGCTGTGAGCTGAGCAAAGCCCTGGGCTCACAGCACCCCAAAAGCC 
TGTGGCTTCAGTCCTGCGTCTGCACCACACAATCAAAAGGATCGTTTTGTTTTGTT 
TTTAAAGAAAGGTGAGATTGGCTTGGTTCTTCATGAGCACATTTGATATAGCTCTT 
TTTCTGTTTTTCCTTGCTCATTTCGTTTTGGGGAAGAAATCTGTACTGTATTGGGA 
TTGTAAAGAACATCTCTGCACTCAGACAGTTTACAGA

Sequence ID 1226 SEQ ID NO: 437 

GGTTTTTATACTTGCCATGAAACTGTTCTTTGGGATATTATTTTGTTCAGGTTCCC 
CACTTGGACAGCAGAGGGGGTGACTCTGCCCATCCCTGCCACTGGTAGCCAGGCGG 
GCAATGTCTGCTAGCAGTCTGCTTCTGTCTGAACTCAGCCAGCAGAGGCAAACTCC 
CGGTTCCCCGAGAAACACTCTGAAGGCAGGGTGGGTGACTCCACCCACCACCGCCT 
CTCCTAGCCATGCAGGCCATGTCTGCTAGAGCTTCCAGCGCAGTGGTCCTAATTCT 
GTCTGAATCCGGCTGAGGGGTGCAGCCTCCTGTTACTGCCCAGGGAAACACCCAGA 
TGGCAGGGTGGGTGACTCCAACCACCTCTGCCTGTGGTAGCCAGATGGGCCACACC 
TGCTAGAGCTTCCAGCCCAGCAGTCCCGCTACTCTGTGGGTGGGTGCCATCCCCTG 
TTCCTCTGGGAAGCACCCAGACAGCTGATTACGTGACCCCACCCACTTCTGCAGAT 
CCTAGCTGAGCAGGACTTGCTGGTTTGGACAATGCCCAAGCAGGGAAGAGCCCTCA 
TTCTCTTATCACTGACAGAGGTGAGATGTCCGANTTTGTANGCTGGTGGAGGAGTG 
AGGTGGAGGAGGTATGCCTCT 

Sequence ID 1228 SEQ ID NO: 438 

GTTATTCAGGTATCCATCAAAATTTTATAAGAGGGCCGGAAACATCGGCTCACACC 
TGTAATCCCAGCACTTTGGGAGGCTGAGGCAGGTGGTTCACTTGAGGTCAGGAGTT 
CGAGACCAGCCTGGCCAACATGGCAAAACCCCGTCACTATTAAAAATACAAAACAT 
TAGCTGGGTGTAGTGGCAGGTGCCTGTAATCCCAGCTATTCGGGAGGCCTAGGAAG 
GAAAATGGCTTGAACCTGGGGGTGGAGGTTGGAGTGAGGCAAGATCACACCACTGC 
ACTCCAGCCTGGGCGACAGAGCGAGACTCCATCTCAAAAGAAGAAAAAAAAAACAA 
CAAAAAAACCTTTATCAGATTATCAGAGGTTATCACTACAGAGGGAGGTAAAATTG 
GAGGGAAAAGGGTACAAATTTATTTCAC



Sequence ID 1230 SEQ ID NO: 439 nt: 741

AAGCAGAANTNTCTCTAAAAACATTATCTCCTTAAAATCTTGAGGTGCATATNAGA 
GCCACAGGCAATCTCTGACATATAAAATTGCAGTACAGGCCTTTCAAATTTGGCAT 
TTCACTGGTACAATACAACAACCAAGATATATAATAACTGTACAGTGCCTAGACAT
TCCAGTAAGAACCATTATTTTCTTTAATGTAGAATGATTAATACATATTCTACAAG
GGGCAGTAAGGTTAGTAATTCTATAGGGTATGTCCCGACATAATTTTCAAATTGTA 
CAATAACACAAACAACTTTGTTAAGGCCATGTTTTATTTGCTGATTAATGGACAAA 
AGGCAATGTAATTTATTTTCAAGTATTTTCTTGAAAGTCTGTGCTCATAAAAATCA 
TGAAAAGTTGGAAAGACTGTTAAATCACTGAAACTTCAAATATATCTTACACAATC 
TTGTTTGTACAAAAATACAAGTTAAATATAAACATAAAGCAATCATGGTAATTTTA
TGCAAATCTGTTTTATGTGATCATCAGTTATATATAAAAGTTTCTCAGTTCTGTTA 
TTTGTGAAAAGATCAATACCAGATTGAATGACTACCTATTGGCAAAGGGCCCTAAA
AAGCTTACTTTAGCACTCATCTTTTACATGGTTAAATGCATTTCCTAATTTGAGAT 
CACCTAAACACTGGAAAAGAAAAAAAATGAAAGGGCAGTATGTCCATAAACCAACA 
AATAATTTGGCTG 

Sequence ID 1231 SEQ ID NO: 440 nt: 203

TTGAGGAAGGGTCTACTGTCTTTTTAAATGGCACAATTTTAAGAGGTTTGAGAGGT 
ACAGTCCCTTAACCTGCCACGGGAGAGGGGCCCCCAAACTTTCTTCCCCCCACACT 
TCTGGTTTTCTGTGTGGAGGGGGAGCAGGGATATCTAAGCTGTGGTGTGAAAGGGT 
AGGAGAGATGCTGGAGGTGGGGGTGCTGTGTTCTA 

Sequence ID 1239 SEQ ID NO: 441 

TTTCCTCGGGAAGCGCGCCATTGTGTTGGTACCCGGGAATTCGCGGCCGCGTCGAC 
AT TTTTTTTTTTTTTTTTTTT T AGAAT G AT T AAC AAT T TAT T GAG T T T TAT T TAT C 
TACAAAAATATAGCAATACAGNGAACTTCACCAAATCCTAAATATTCAGTACCTGA
ACTGGCTACAACACCGNGTGCACACCCAGTTCCTGCAGAATCTCTTGCAGATATGG 
GAGAGTCAGCCAGTGAAAAGATCCATTTCTTGGGAATCCTTGTCAACAAGACCAGT
TCAGAAATCCAGGATATATAGAAGCCTACTGTAATTTAAAAACAGTAACAAAAACC 
CCAACAAAACCCAAATCAACAAAGACCAAGATAAAGGNGTGATAAACATTAATTGT
AATGGTTTTCCTTTACATGCAATACATGCATTTTAAAATCACTAAGAAACACGAAA 
TTTTGTAGAGCAAAGTTTGNGTTTCACGTAAGTGCAAATGAATATATATTTTATTT 
TTTATACTATTAAATTATATATATTTTTTCCATACAAAAGCACACAGTGTTAATCT 
ATAAAATGACATCCAAGTGGATGATGATTGTTTTTGCATGTCCCCCTGCTTAGATT 
TTTTTAAAATATATAGTCAAAAATTAACATCCTTCTTTAAAAATACAGAAGGGAAA 
AANGGGCAAAAAAAAAAATCTAGACTCGAGCAAGCTTATGCATGCATGCGGCCGCA
ATTCGANCTCGGNCGACTTGGCCAATTCGCCCTATAGNGAGTCGNATTACAATTCA 
CTGGGCCGNCGNTTTACAACGTCGNGACTGGGAAAACCCTGGCGTTACCCNNCTNA 
TCGNCTTGNAACAATNCCCNTTTNGCCAGNGGGG 

Sequence ID 1255 SEQ ID NO: 442 

TCACTTCGTATNGAANCTGTTTGGACTTGCTCAACAAGACCTTATCTTAACAAAAA 
GTAACTTATAGAAAAGGGAGACATTCATTTAACTTCAAGCCCATATTATTCTTAAA 
AGCTGACTCTTGAAATAGTATTTATTGAGTCATAGTGGAGTCATGGGACTTTTTAA 
GGGCCGGAAGGGACTATTTAGATCATCCAGTCCCACCCTGTCATTTTATGGAGGAG 
GAAACTGAGGCCTAGATAAGATAACCAGTTAGTGGGTCCACTGACCTTTAGGACAG 
TAGTCTATCCGTAAGAGACAACATGGAGAAAGAAATACAACGTTTTTATAGTGAAT
TATCATCTTACAAAGAATATTCTTCCCATATCGCACTTTTAAAAAGTGGGTACCTT 
AGTCAAATAGGAGAAAAAACCACTTGAGTAGTTTCATCCTCAGGTTTTAGGTGAGG 
AAACTGATACTCAGATTAAATAACTTTAAGCACACAGAGCCTGAATGATAGTCTTA 
TTTGAGCTCATCTGTGCTTTTAATGTGTACTACGTTAGGTGTTTTCACTTGCATTT 
CCTTTAGTCTTATTTGAGCTCATCTGTGCTTTTAATGTGTACTACGTTAGGTGTTT 
TCACTTGCATTTCCTTGTTTGACGTTGACAATAAATCGTGAAGCTGCCTTATCTAA 
GNAGTCCTAAAGTAAATCATTGGAACACATGTANCCAGTTTGTTGTTTTTATTTGC 
CAGGTNTCAAATATAACTGAAAACCCATGCTAACTGACTNATTTTAAAAGNTGTNT
GGGGCATGAAANGATTGCTCTGCCTGGGCGGGNGGTTNANCCTGNGTCCCCCNTTT 
NGGAGNCCACCCANGANGCGATATTTNAGGGNNGATTCNAAACCCCTGGCACGNGN 
NAACCCCNTTTTTAAANANAAAANANCGGNNG

Sequence ID 1256 SEQ ID NO: 443

TTGTGTTGGTACCCGGGAATTCGCGGCCGCGTCGACGGAGTTTTACCTTATTACAC 
TTTAATCTCTGGATTTACCCCATCTCATTTCTCTTTTAGGAAAACTGTTTGTATGT
GGTGGCTTTGATGGTTCTCATGCCATCAGTTGTGTGGAAATGTATGATCCAACTAG 
AAATGAATGGAAGATGATGGGAAATATGACTTCACCAAGGAGCAATGCTGGGATTG
CAACTGTAGGGAACACCATTTATGCAGTGGGAGGATTCGATGGCAATGAATTTCTG 
AATACGGTGGAAGTCTATAACCTTGAGTCAAATGAATGGAGCCCCTATACAAAGAT 
TTTCCAGTTTTAACAAATTTAAGACCCTCTCAAACTAACAGGCTTAGTGATGTAAT
TATGGTTAGCAGAGGTACACTTGTGAATAAAGAGGGTGGGTGGGTATAGATGTTGC 
TAACAGCAACACAAAGCTTTTGCATATTGCATACTATTAAACATGCTGTACATACT 
TTTTGGGTTTATTTGGAAAGGAATGCAAAGATGAAGGTCTGTTTTGTGTACTTTTA 
AGACTTTGGTTATTTTACTTTTTGGAAAAGAATAAACCAAGAATTGATTGGGCACA
TCATTTCAAGAAGTCCCCTCTCCTCCACATTTGTTTTGCCAATTTGCACATTAAAT
GACTCTTCCCTCAAATGTGTACTATGGGGTAAAAGGGGTAGGGNTTAAANATGTAA 
ACAGTTGGGTTTTTTAAGGGNCCTTTTTCATAACTGGAACACTCTNTACAAGGNTN 
CTTNTTAAATAAATAACTTGACTTTTTTGTTTTNTAAANGNANCTTCNTGCTTCCA
TAAAAAAAAAAATTTAANTNGNCANCTNTGCTGCTGCGNCCANTTNGCTNGNCCNT
GGCATTCCCTAGGGANGNTNAATANTGGCNNNTTAACNNGGCNGNAACNNNNNCCA 
NT 

Sequence ID 1331 SEQ ID NO: 444 

GGGCGATGCATGCTTTATTAAGGCTCTTGTTTCACCTGGCAGTGTACTGTATCAAC 
GTATAATACAGAAAAAAAATCTCTTTAAGGTCCTCCTTCACAAAGACATAGAGTGA
AACTCCCTTTACATGTCAGTATTTGTTCAACACTTTAGGCAACTTGACTGTCAGTG
TTAAAATGGAAAACAGGAAAATGGAAAAATCTGACCAATTCTGCCACCTTGAGACT
TTCATATAGACCTTGCACAACAATTGTATAGATCACACACCGGCTGTATTTAATAT 
GTAACATTTTCACACATATTAAAGATACAGAAGTATTAAAAAACCCCCAATGTTAA 
TGTATTTGCTTAAAAGGCACAAGTTTCACATATCTGTCTAGCTATCTGTTGGTAAT 
ACAGAAAGTATACTACTTTTTTAAAAAAGTGGGCAGAATTCTTGTGTATGTATATT 
TGTGTGTACAGTATGTGTATGTGTGTATATATATATATTATATATATAGATAATAT 
ATAAATATTTTTTTTAAGGAGAAACTAGAATGTTTAGCTAGAAAATTCCACAGCCT
GTGAAGAAATATTTCAAAATGGCCATAAAGGAGGTAAAAATGAAAACCATAACCTA
ACTTTTATAGAGGCTTTATCTTTAATTTAACGATGTGCGGAGGACTTTCTTGCTTG 
AATCTGTTCCGGGCTGTCTGCTCTGTCCATCAAATGGGCAGGTCTGGGAATGAGGC 
ACCTTCGGCCGTTCAGAAGTGGCCTGAACAGAATGCTGGAACCCAGGCTGGACTCG 
GAC 

Sequence ID 1332 SEQ ID NO: 445 

CAAACCTGCATGTTCTGCACATGTATCCAGGAACTTAAAAAAAAAAAAAGATAGTT
TGTGTGTCTTAATTGAATAATAGTAGATTTATAGATTAAAGATCTATGGGTTTTTA 
ATATGGATTAGAAATCTGTGGGTTTTTGATATGGATTAGAAATCTGTGGGTTTTTA 
ATATGGATTGGAAATCTGTGGGTTTTTAATATGGATTAAAAAACATCTGTGGGTTT 
TTAATATGGATTAAACATCTGTGGGTTTTTAATATGGATTAAACATCTGGGTTTTT 
AATATGGATTAAACATCTGTGGGTTTTTAATATGGGTTAAAAATCAAAAGAAAATG 
AACTATTTGCTCCAGTGCAGGAAAATACAGGCAATACTGGATACAATTAGATGGTC
AGGAGCGATAACCCGGTTGCCATTGTTTGAAGAAGAGAATAAGGTGCTAGCATTCC 
TATCCGTAGATAATTTGACAGCTAGGAAATAGGGGGAGTCTTCTATGTAGTTAGTG 
AAGGCTAAATGAACTATTATATGCAGTTATCGTAGAAGAGTACTCAAAAAAATCTG 
TAAAAAATAAAGAAAGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTG
GGAGGCCGAGGCGGGTGGATCATGAGGTCAGGAGATCGAGACCATCCTGGCTACCA 
NGGTGAAACCCCCGTCT 

Sequence ID 1335 SEQ ID NO: 446 

CAAGACTCCATCTCAAAAAAAAAAAAAAATCTACAGTGCTGAGTATATAAAATTAT
TAACACATTTCACAACAATATGTGTTTGTGGAGTTAAATATTTTTTGTCTTTAAAA 
CAGGTAATTTTAGTGCATACTTAATTTGATGATTAAATATGGTAGAATTAAGCATT 
TTAAATGTTAATGTTTGTTACATTGTTCAAGAAATAAGTAGAAATATATTCCTTTG 
TTTTTTATTTAAATTTTTGTTCCTCTGTAAACTAAAAGAACACGAAGTAATTGGTC 
ACAATTACTGGTGTTTAACTGCCAAATATGGGTAAATAAGGGAAAATTTTGTTTAA 
TATTTAGTCCTTCTGAGATGGCTTGAATATTTGAATTTTGTTGTACGTCTATACTG 
GGTAGTCACAAGTCTTATAAACACTTTAGAGGAAAGATGGATTTCAGTCTGTATTT 
TTAAACATCATTTATTTTAAATCTGGTGCTGAAAAATAAGAAAAAAATTAAACTGC
ATTCTGCTGTTCTTCTTTAGAAGCATTCCTGCGTAAATACTGCTGTAATACTGTCA 
TGCAAAGTGTATCCTTTCTTGTCGTATCCTTTTTGGGGCAGTGTTTTTTTGTTTTT 
TTCCTAGAAATGTTTGTCCTTCCCCCACCTGTTGATCCAGGTTAAGGAATACTTTT 
TTACACTTTATTCAAA 

Sequence ID 1336 SEQ ID NO: 447 

CTTTTCCTCCCGCTGTCCCCCACGGAGGGGACTGCTCTCCCCCGCTGCATCCTTTC 
TGTGAGGTACCTTACCCACCTCAGCACCTGAGAGGGTGAAATAGAATTCTAACCTC 
GACATTCGGGAAGTGTTTTTGAGAAGTCTCGGTCGGTAAGGGAAGTCTTCCAAGTC 
CGTGCAGCACTAACGTATTGGCACCTGCCTCCTCTTCGGCCACCCCCCAGATGAGG 
CAGCTGTGACTGTGTCAAGGGAAGCCACGACTCTGACCATAGTCTTCTCTCAGCTT 
CCACTGCCGTCTCCACAGGAAACCCAGAAGTTCTGTGAACAAGTCCATGCTGCCAT 
CAAGGCATTTATTGCAGTGTACTATTTGCTTCCAAAGGATCAGGCCCTGAGAACAA 
TGACCTTATTTCCTACAACAGTGTCTGGGTTGCGTGCCAGCAGATGCCTCAGATAC 
CAAGAGATAACAAAGCTGCAGCTCTTTTGATGCTGACCAAGAATGTGGATTTTGTG
AAGGATGCACATGAAGAAATGGAGCAGGCTGTGGAAGAATGTGACCCTTACTCTGG 
CCTCTTGAATGATACTGAGGAGAACAACTCTGACAACCACAATCATGAGGATGATG
TGTTGGGGTTTCCCAGCAATCAGGACTTGTATTGGTCAGAGGACGATCAAGAGCTC 
ATAATCCCATGCCTTGCGCTGGTGAGAGCATCCAAAGCCTGCCTGAAGAAAA 



Sequence ID 1337 SEQ ID NO: 448 

CAAGAACTCTGGGACATTTGCAAAGGGTATGGCATATGTGTAATGGGAATACCAGA 
GGAGAGGAAAGACAGGAAGTCAAAAAAAGAATTTTTCCAAATTAATGATAGGTTCC
AAACCACAGATGCAGGAAGCTTAAACACCAACAGGATAAATAAAACAAAATCTACG 
CTTAAGCATATCATACTTAACCTGCAGAAAATTACAGACAAAGAAAAAACACCAGA
GGGGAAGCTGGCAGAAACATACCACCTATAGCGGAAGAAGAATAAGAATTACATCA 
GACTTCCCTTCAGAAATCTTGCAAACAAAAAGATGTAGCACAATATTTAAAGTATT
AAAGGAGGCCGGGCCCGGTGGCTCGGGCCTGTAATCCTAACACTTTGGGAGGCTGA 
GGCAGGAGGACCATGAGGTCAGGAGATCGAGACCATCCTGGTGATGGTGATACCCC 
ATCTCTACTAAAAATACAAAAAATTAACCGGGCATGGTGACACGCACCTGTAATCC 
CAGCTACTTGGGAGGCTGAAGCAGGAGAATCGTTTGAGCCCAGGAGGTGGAGGTTG 
CAGTGAGCCGAGATCACATCACTGCACGCCTGGGCAACAGAGCGAGACTCCATCTC 
AAAAAA 

Sequence ID 1338 SEQ ID NO: 449

CGACCCGTTTTAGTCAGGATGGTCTCGATCTCCTGACCTCGTGATCCGCCTGCCTC 
GGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCGCGCCCGGCGTAAATCAG 
GTTTTTTAAATGTTTGCCAAACCTTATCACTGACTTTTATAACAAAATTATTTACT 
ATAATCATTAGGGAATATTTAAGTTCTGCTAATACTTAAAATTGCAGAGTGCTAAA 
ACCAGCAGTGAGTTTAGAATCAAGCTAAGCTTTATTGTTGCTACTATTTGAGGCAT 
ATTAGTTGACTGGTGTTCATATGCAAGGCAGTCTACTGGGTGCAACAAGGGTTAGA 
AGGATATTTTTAAAAAACTGACCCTATTCTCAGGATGAAAATAATACACTAGTAAT
AGTCTGCTCTGTTGGTTAACTCCTCGTAAGGAGGTACAATTAAAATGCTGTAGTGT 
TGCAAGGGAAGGAGAGGAAGAATCATATTCCTTCACTAGCAGGATCAAGAAAGCTT
TTATAGAAATATACAAAATCTTCACTTCTTGAAGGATTGGTAAAATTTAATAGCCA 
ACATTGGGCACTTATTCATTCTCTGAGTAAATATTTATTGCATGCTTATCTTGTAT 
CAAGCATTGTGATGAAAGCACAAGAATGAAAGAGGAGGGAGAATGTTTAGAGAATA 
AGGGCTGAAACACAGATTTTGTAGGGAGCGTAGGGGAGACTGANAAGACAGGTTCA 
GGTTAGTAAGGGCGCTCATATTTTGACCCTGAATGTTAACTATGTGCACATCATGC 
TAGCTATTCTAAATCAGGCATTTTCAAATGGAAGCAGGCACTGACATTTT

Sequence ID 1344 SEQ ID NO: 450 

CGTGAAGGGTCTTTATGTATTAGTATTAGAGTGATCTTTTGATTATTTTCCTCACT 
ATAAGGAAATTATTTCCTCAGGATGAGCTGCCATAACATTCCACTGTCTGATGGCA 
ATTTTAAAGCCTGAAATTGAAGCCCATGGCTAGGCTATGAGAACCCTAGTTCGTAT 
AGTAAAGTTGATATCTTCTGGATGTATACTAATTTTAGGCTTTATTTTAAAACTGC 
TGGAAACTGAAACTTAGACAAAAGTATTTTCAGGACATCATTTACAATGTTTAGCC
CTAAAGAGTCAAGCTGTGGGATTCTGAGTCTTTCATATGTTACAGCAGAAACTTAA
AAGCAAGAGGAAATTGGCTGGGCACAGTGGCTCTGTAATCCCAGCACTTTGGGAGG
CTGAGGTGGGTGGATCATGAGGTCAAGAGATTGAGACCATCCTAGCCAACATGGTG 
AAACCCCATCTCTACTAAAAATACAAAAATTAGCTGGGCGTGGTGGCACACGCCTG 
TAATCCCAGCTAGTCAGGAGGCTGAGGCAGGAGAATATCTTGAACTTGGGAGGCAG 
AGGTTGCAGTGAGCCAAGATTACATCACTGCACTCCAGCCTGGTGACAGAGCGAGA
CTCCGACT 

Sequence ID 1348 SEQ ID NO: 451 

CTGAAACTGCACTGAACCCACAGGTAGGTTACATCACAGGACAGAAATCTGAGGAG
CTGGAGAAAGCAAAAGAATAAAGGATGGGCTGACACCAGAAGGAATTAAAGGAATT 
TTTATACTGAACTTCAATTACTTGTTCATTTGAAGTTTGTTTTTTTAATGAACGTT
TTTGCTGTTACTTAAATATAGTGTTTTGAAAGTGTTTCAAATGTATTCAAGTTGGG 
ATTTTCCATATTTTACTACAGTTCTGTCTTAGTATGTTCACCATAAAACACTTATC 
ATTAAAGCTCACAAAGTGCTTTTTTGTAATATGAGGATAAAATGAAGCCATATAAG 
AATTTTTTTATATCTGTACATTTAACCCACATTTGAGCTTTAGCCAAAATATATAG 
CTTTTTTTTTTCTGACCTGGCCAACGTATTATCCAGCAAACATCAACTGAAGCAAT 
ATGGAAACACTTCCAAATGTTTGCCAATAATGCTATTAAGTGACTGATGTCAACAT 
TAGTTACATGGCAAACTAAAGAGGCATTATACATTTTTAAAACACACTAACATATA
ACTGTAGATAATGTAAGGTTTATTTATATGCATATTTCATAGTATATTTAAATGTT 
TAAATATAAAAAAGGGTTTTTAAACACTTTTAATTTTTATCTTTGATTTTTTTTAT
TGATATCTCTTTCCAGGCTACTAATAAAATTGCCAGAACTAAACTATCAGGTAAAG
GTTAAGGCATCAATTGACAAGTAAGTTTTCTAATTTCGTTTTGAATTACAATTCCA 
AATGTAAGACTTTTAAAAATGAATGGCCTTTATTTTATAGAATAATTTTGACCTTT 
TAAATTTACTTATCTAACATTATATAATGAATGTACTTCAAATATTTGACTTTGAA 
GTCAACATTAACAAATTCATGGATCCTAATTAAAATTTACTATAAAACTGGAATCA 
TTTATTACTTCCTT 

Sequence ID 1351 SEQ ID NO: 452 

TTTTTTTTTTTTTAAAAGAGATGGGTTCTCACTATGTTGCCCATAATGTTTATGAG 
ATTAAGTTCATCTTTTTTATCTGAGTAGTATTTTATTGTATGAATATACCACCATT 
TATTTATCTGTTGGTTATTTCCAGTTTTGGGCTATAATCCAAAATGCTTTTTTCAA 
ACAATAGGCTATATATCATTAATGTCCGTTTATCAGCAGTATAAAATATCTTACCA 
TAAATATTAATAAAAGAAGCATTCATATATAAAATATAGATATTTCAAACCCTACA
GAGGGCCTTTTAATGATTAAATATTTTGTCCTTACAAAAAGGTCCAGGTAATTACA 
CCCATGAGGTTAACCTGCCTTAGTGCAGGACTTAAAATAAGGCTTCTCCTGCCATC
TCTCTCCATTTGTAGAATGTGAAATTCTTTAAAATGCATCCTATATTAGGAATACT 
ATAGCTGTGCACTGGTGTTTGTTCTCTTCTTTAAACTCGGGACCGTATATATCTGC
TCAAATTGCCCAAGTATACATATGCTGCACTCCATCAAGTGTCAGGCCACATTCTA 
TCAGCACAGCGTGACTGCCTATCAGTGACAATATAAGTGAGCTCTATTTGGATCCC 
TCTTACCCTACCTTTTATATTTATGACAGCATTATCATAAAACTCCAATATTCTTC 
AATAACTTACATGTTTGTTGTAGGATAAAATTATTACCCTCAATGAACTACAT

Sequence ID 1352 SEQ ID NO: 453 

ACCAGCTTCTTCACAGGTTCCACGAGTCATGTCAACACAGCGTGTTGCTAACACAT 
CAACACAGACAATGGGTCCACGTCCTGCAGCTGCAGCCGCTGCAGCTACTCCTGCT 
GTCCGCACCGTTCCACAGTATAAATATGCTGCAGGAGTTCGCAATCCTCAGCAACA 
TCTTAATGCACAGCCACAAGTTACAATGCAACAGCCTGCTGTTCATGTACAAGGTC 
AGGAACCTTTGACTGCTTCCATGTTGGCATCTGCCCCTCCTCAAGAGCAAAAGCAA
ATGTTGGGTGAACGGCTGTTTCCTCTTATTCAAGCCATGCACCCTACTCTTGCTGG 
TAAAATCACTGGCATGTTGTTGGAGATTGATAATTCAGAACTTCTTCATATGCTCG 
AGTCTCCAGAGTCACTCCGTTCTAAGGTTGATGAAGCTGTAGCTGTACTACAAGCC 
CACCAAGCTAAAGAGGCTGCCCAGAAAGCAGTTAACAGTGCCACCGGTGTTCCAAC 
TGTTTAAAATTGATCAGGGACCATGAAAAGAAACTTGTGCTTCACCGAAGAAAAAT
ATCTAAACATCGAAAAACTTAAATATTATGGAAAAAAAACATTGCAAAATATAAAA
TAAATAAAAAAAGGAAAGGAAACTTTGAACCTTATGTACCGAGCAAATGCCAGGTC
TAGCAAACATAATGCTAGTCCTAGATTACTTATTGATTTAAAA 

Sequence ID 1353 SEQ ID NO: 454 

ACATTCTGGAAAAGGCAAAAGGGAGGAAGAACTGATTAGTGGTTAGCCCAGGGTTA 
GAGTTGGGGAGAGGATATAATGAGGGAACTTTTGTGGATTCTGTACCATGATTATG 
ATTACACAAACCTATGCATACATTGAAACACATAGAACTATACGTTGAAAAAAGTG 
AATCTGCCTGTATGTAAATTTAAAAGAAAAATATTTTTTTAAAAAAACAGATGCTT
CTTAACACATTATCATCTATGTCAGTTTAACAGTTAGTAGACTTAGGCCAGGTGTC 
ATGGCTCACTCCTGTAATCCCAGTGCTTTGGGAGTCTGAGGTGGGACGATCTCTTG 
AGACTAGGAGGGAGTTTGAGACAAACCTAGGCAATGTAATGAGACTCTTTCTCTAC
AAAAAATTTTAAAGTTATCTGGACATGGTGGTGCCTGCCTGTAGTCCCAGCTACTT 
GGGAGGCTGAGGTGGGAGGATTCCTTGAGCCCAGAAGTTCAAGGCTACAGTGTGCT 
ATGATAGAGCCACTGCACTCCAGCCTGGGCAACCAGGTGAGACCTTGTCTCTAAAA 
TGAATAAATAAAT



Sequence ID 1355 SEQ ID NO: 455 

TGGTCTTTCACCCAGCCAGGGAGAAGGTTCTTCGCTCAGTATGAAGAAAAGCAACC 
CAAAACTCTCAATCTGATTTGTTTTTGTTTATGTCGATGCCCTGTAGTTTGAAAGT
GAAGTAAAGATTTAGAATTCACCTAAGTCCAAAGGAAAACACGTGGTTTTTAAAGC 
CATTAGGTAAAAAAAGTTCTCAATAAAGGCATTACAATTTTTTAGGTTTAGAAAGA
TGGACTTTTCTGATAAATCTTGGCAGACATCTAAAAAAAAAACCATATTTTTCACA
AGAAAATGCAAGTTACTTTTTTTGGAAATAATACTCACTGATTATGGATAAAATGG
AATATTTTCAGATACTATATTGGCTGTTTCAAAATAGTACTATTCTTTAAACTTGT 
AATTTTTGCTAAGTTATTTGTCTTTGTTGTATCTATAAATATGTAAAAAATATTTA 
AATAGATGTACCTGTTTTGCTTTCACACTTAATAAAAAATTTTTTTTTGT 

Sequence ID 1359 SEQ ID NO: 456 

CGGGATCCCTAGTATAACACATTCAGTGTTCCCCTTTCAGTCTTACTACTTTGACC 
GCGATGATGTGGCTTTGAAGAACTTTGCCAAATACTTTCTTCACCAATCTCATGAG
GAGAGGGAACATGCTGAGAAACTGATGAAGCTGCAGAACCAACGAGGTGGCCGAAT
CTTCCTTCAGGATATCAAGAAACCAGACTGTGATGACTGGGAGAGCGGGCTGAATG 
CAATGGAGTGTGCATTACATTTGGAAAAAATGTGAATCAGTCACTACTGGAACTGC 
ACAAACTGGCCACTGACAAAAATGACCCCCATGTGAGTATTGGAACCCCAGGAAAT 
AAATGGAGGAAATCATTTGCCTTAGGGATTGGGAAAGCTGCCCACTAACTGTCTTC 
CCCATTGTTTTGCAGTTGTGTGACTTCATTGAGACACATTACCTGAATGAGCAGGT 
GAAAGCCATCAAAGAATTGGGTGACCACGTGACCAACTTGCGCAAGATGGGAGCGC 
CCGAATCTGGCTTGGCGGAATATCTCTTTGACAAGCACACCCTGGGAGACAGTGAT 
AATGAAAGCTAAGCCTCGGGCTAATTTCCCCATAGCCGTGGGGTGACTTCCCTGGT 
CACCAAGGCAGTGCATGCATGTTGGGGTTTCCTTTACCTTTTCTATAAGTTGTACC 
AAAACATCCACTTAAGTTCTTTGATTTGTACCATTCCTTCAAATAAAGAAATTTGG
TACC 

Sequence ID 1360 SEQ ID NO: 457 

TGCGCAGACCAGACTTCGCTCGTACTCGTGCGCCTCGCTTCGCTTTTCCTCCGCAA 
CCATGTCTGACAAACCCGATATGGCTGAGATCGAGAAATTCGATAAGTCGAAACTG
AAGAAGACAGAGACGCAAGAGAAAAATCCACTGCCTTCCAAAGAAACGATTGAACA
GGAGAAGCAAGCAGGCGAATCGTAATGAGGCGTGCGCCGCCAATATGCACTGTACA 
TTCCACAAGCATTGCCTTCTTATTTTACTTCTTTTAGCTGTTTAACTTTGTAAGAT 
GCAAAGAGGTTGGATCAAGTTTAAATGACTGTGCTGCCCCTTTCACATCAAAGAAC 
TACTGACAACGAAGGCCGCGCCTGCCTTTCCCATCTGTCTATCTATCTGGCTGGCA 
GGGAAGGAAAGAACTTGCATGTTGGTGAAGGAAGAAGTGGGGTGGAAGAAGTGGGG
TGGGACGACAGTGAAAT

Sequence ID 1361 SEQ ID NO: 458

TATAAATACACTCCGGGATGATTTACCCCCGGAGGTCAGCTAGTAAAATACATGAG 
TAGAATTCCTTAAAGTATGTGATAATTGCTCATCACTATCCAAGTGTGACATAAAT
CATAAAAAGAATTGACAAAATCAGGGTCGCAAAGAGAATTGAAAAAAATCTGTCAC
AACCAAAATTTAAATTGACCTCTGTCCTAGAGTATGAGAGCCACACTGAACAGAAA 
AACCAGATAAATCTTTTATAAAATATTCATTTGCAGCCCCATTAACGTTGCTTGTC 
ACCCCACCTCCCCATGTCCTTGGACAAACTGAATGTATAGTAACATCATCCCAGGC 
CAGGCGCGGTGGCTCATGCCTGTAATCCCAGCACTTTGTGAGGCTAAGGCAGGCAG 
ATCAGGAGGTCAGGAGTTCAGGACCAGCCTGGCCAAAAAGGTGAAACTCCGTCTCT 
ACTAACAATACAAAAATTAGCTGGGTGCGGTAGTAGGCGCCTGTAATCCCAGCTAC 
TCGGGAGGCTGAGGCAGGAGAATTGCTCAAACCCGGAAGGTGGAGGTTGCAGTGAG 
CTGAGATCGTGCCACTGCACTCCAGCCTGGGTGACAGAGCAAGACTCTGTCTCGGG 
GAGGGGGGTGGCGGAGATAAAGAAATAACATCATCTTATACTGTCAAGCTCAAGGT 
GTCTGCAGCCTTATCTTCAGGGGAAGTTGTGTCTTTCTCAGGGAAGATACAGATTT 
CAATTTAGAGCAAGACAGAGAGAAGTTACATTCAGAGAGGAAAATGCAGTAGTCTA
ACTG 

Sequence ID 1364 SEQ ID NO: 459 

GCGGCCGCGCTCTTTTCAATTTTTAAAAAGAAGTTTGTTTTCCATTTCAGTAATTT
CTGCTTTGATCTTCCTTATGTCCTCCTATTGAGTTGATCAGCTTTCTTTATTCTTG 
CCTTTTCTCCTCTGTGTGCCCTTTCTATTAACGTATTTACCCTTAGGCTGGGCACA 
ATGGCTGATGCCTGTAATCCCTGCACTTTGGGAGGCCGAGGCAGGTGGATCACCTA 
AGGTCAGGAGTTCAAGACCAGCCTGGCCAACATGGTGAAACCTGGTCTCTACTAAA
AACACAAAAATTAGCCAGGCATGGTGGTGTGCACCTGTAATCCCAGCTACTCAGGA 
GGCTGAGGCAGGAGAATTGCTTGAACCTGGGAGGCGGAGATTGTGCCAAAGCACTC 
CAGCCTGGGCAACAAAATGAGACTTTGTGTC

Sequence ID 1365 SEQ ID NO: 460 

CACCAGGCTGTCTTCAGATACTTCATACAGAAATGAGCCTCCCTGTGGGGTCCTCT 
TCCCTCCTTCAGCCTGTCCATCAACACAGCATTGCGGGATCCTTACCATGGCATCC 
AGCCCTGGAGATGCTTCAGGAAAGTTGCAGGTCCATGCTGCAGGACAGGCTCAGAT 
CAGCAGAGACGCATCTCACATCGGGCTGTGAAATTCAAGTTGAGCTGCAATTGGCA 
ATGAGAA 



Sequence ID 1366 SEQ ID NO: 461 

GTTATTCACTGAGACCGTGCCCCGGTTATGAGGTTGTACCAGAAAGCAAGTATTCA
CTATGCACACTATTCACCGCTCACCCTAGCATTGAAGCCAGCCTGTAGCCTGAAAG 
CCTTTGCTTTGAGGGCAGGTCTTTCCCCAAAATGCAGACACGAAGGTGCAAAGTGA
AGCTGCCAGTCTTGCAAAAGATGTAACTTGTCACGAAGGCCACGAGTGGCAGGGAG 
AGCTGTCCCACATTTGCGGAAGTGGCTATGTGAGGACGGGGGAGGCGGGTCCCTTA 
GAGATGAGACAATCATAAGGGGAGATATCAGAGAAAATCGTAAGGGGAGCAGATGG 
TTGTCAAGAGAATAGGCTGACCATCGAAGGACTGGCAGAAGCTTTCAGAAAACCAC 
TGGACGGCTGGGCACAGTGGCTTAGGCCTGTAATCCCAGCACTTTGGGAGGCTGAC 
GCAGGTGAATCACTTGAGGTCAGGAGTTCCAGACCAGCCTGGCCAACATGGTGAAA 
CCCCATCTCTACAGAAAATATAAAAATTAGCCAGGCGTGGTGGCACAAGCCTAGAA 
TCCCAGCTACTTGGGAGGCTGAGGCAGGCGAATGGCTTGAACCCAGGAGTCAGAGG 
CTGCAGTGAGTCGAGATTGTTCCACTGCACTCCAGCCTGGGTGACAGTGCAAGACT 
CCTTCCAAAAAAAAA

Sequence ID 1367 SEQ ID NO: 462 

TTCGTGAGTGATGGCGTCCCGGGTTGCTTGCCGGTGCTGGCCGCCGCCGGGAGAGC 
CCGGGGCAGAGCAGAGGTGCTCATCAGCACTGTAGGCCCGGAAGATTGTGTGGTCC 
CGTTCCTGACCCGGCCTAAGGTCCCTGTCTTGCAGCTGGATAGCGGCAACTACCTC 
TTCTCCACTAGTGCAATCTGCCGATATTTTTTTTTGTTATCTGGCTGGGAGCAAGA 
TGACCTCACTAACCAGTGGCTGGAATGGGAAGCGACAGAGCTGCAGCCAGCTTTGT 
CTGCTGCCCTGTACTATTTAGTGGTCCAAGGCAAGAAGGGGGAAGATGTTCTTGGT 
TCAGTGCGGAGAGCCCTGACTCACATTGACCACAGCTTGAGTCGTCAGAACTGTCC 
TTTCCTGGCTGGGGAGACAGAATCTCTAGCCGACATTGTTTTGTGGGGAGCCCTAT 
ACCCATTACTGCAAGATCCCGCCTACCTCCCTGAGGAGCTGAGTGCCCTGCACAGC 
TGGTTCCAGACACTGAGTACCCAGGAACCATGTCAGCGAGCTGCAGAGACTGTACT 
GAAACAGCAAGGTGTCCTGGCTCTCCGGCCTTACCTCCAAAAGCAGCCCCAGCCCA 
GCCCCGCTGAGGGAAGGGCTGTCACCAATGAGCCTGAGGAGGAGGAGCTGGCTACC 
CTATCTGAGGAGGAGATTGCTATGGCTGTTACTGCTTGGGAGAANGGCCTAGAAAG 
TTTTGCCCCCGCTGCGGCCCCAGCANAATCCAGTGTTGCCTGTGGCTGGAGAAAGG 
AATGTGCTCATCACCAGTGCCCTCCNTTACGTCAACAATGTCCCCCACCTTGGGAA 
CATCATTGGTTGTGTGCTCAGTGCCCGATGTCTT 

Sequence ID 1368 SEQ ID NO: 463 

CAGTGAGCCAAGATCACACCACTGCACTCCAGCCTGGACAACAGAACGAGACTCCA 
TATCAAAAAAATTAAATTAAAATATAATAAATTTCTTGCCGGGCGCAGTGGCTCAC
ACCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACGAAGTCAGGAGA 
TTGAGACCATCCTGGCTAATACAGTGAAACCCCGTCTCTACTATAAATACAAAAAA
TTAGCTGGGCATGGTGGCGGGCGTCTGTAGTCCCAGCTACTCAGGAGTCTGAGGCA 
GGAGAATGGTGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCGTGCCACT 
GCAATCCAGCCTGGGCAGCAGAACGAGACTCCATCTCAAATAAATAAATAAATAAA
ATGAATTTCAGCTAGAAGAGCCTTATTCCATTTTCCTTTTTATTAAACATCTGGCA 
TAAGTTGGTAAGTATGTGAAGTTTATCATATATTCTTATGCGAATTATTATTTTCG 
CCTTTTTTTTTATAATTCTGTCTGGGATTTGAATAGTAGAGTTTGAATTCAGGAAG 
GACACCTGTGATAGGACAATAAAAT

Sequence ID 1369 SEQ ID NO: 464 

CTGATTGCAAAAACATTACAACTCAGTACTGCGGCTTTCATTCAAATAGGTTATAT 
GTATAAACTGAGGTTCAACAATATTGTATTTGAGATGGGAAAGTTAAAGAAATGCA
ATAATGTAAATAATACTTAAGAAAATAAGATCTCAGGAAACTGTGTATACTCTGTA
CTTTTATGCAACTTTATCAGATCATTTCAGTATATGCATCAAGGATATAGTGTATA 
TGACATGAACTTTGAGTGCAAAAACTGTACTATGTACCTTTTGTTTATTTTGCTGT 
CAACATCTAAATAAAGGTTTTTTTG

Sequence ID 1370 SEQ ID NO: 465

CGAAAGGACTACAGAGCCCCGAATTAATACCAATAGAAGGGCAATGCTTTTAGATT
AAAATGAAGGTGACTTAAACAGCTTAAAGTTTAGTTTAAAAGTTGTAGGTGATTAA
AATAATTTGAAGGCGATCTTTTAAAAAGAGATTAAACCGAAGGTGATTAAAAGACC
TTGAAATCCATGACGCAGGGAGAATTGCGTCATTTAAAGCCTAGTTAACGCATTTA 
CTAAACGCAGACGAAAATGGAAAGATTAATTGGGAGTGGTAGGATGAAACAATTTG 
GAGAAGATAGAAGTTTGAAGTGGAAAACTGGAAGACAGAAGTACGGGAAGGCGAAG 
AAAAGAATAGATAAGATAGGGAAATTAGAAGATAAAAACATACTTTTAGAAGAAAA
AAGATAAATTTAAACCTGAAAAGTAGGAAG 

Sequence ID 1371 SEQ ID NO: 466 

GTCCAGNAGAAAGTTCAGTGACTTGTCCAGAGCTGCAGGTCTTAAGAGGCTGAAAT
CTCGCCTCTGCCTCGAGGCTGCGGTTCCACTGACCCATACTACTTGCCTTCAGGAA 
AGAGAAATGGTGTAGGAAGGCTGTGGATGAAGACGCTTACATTCATGAAGGATTTG 
GATAGGCGAACATGAGCTTTTCCACCAAATTTCAGAATTTTAAGAAATGCCTTAAA 
TTATTTCTTAAAAATCAATTTGGGGCAGACGAGAAGTTCTGATAATAGTTTTTAGG 
GAACATGATAAAATTCTGACCTTAGAAGTGGTATACCAGTTTGAGAAGAAGAACAA
GCTATAAACGGTGTAGATAACATTCACGGCTATTTAAGAAAGAGTTACTAAGGGAA
ACCAGAATGACTTAAGAGTGTTACTCTTCTTTTTCTGAGAGAACAATAGCATCATC
TCAGAAAGCCTTTCATGCCATTAATAGGTAAGAATCTGGGCTTCTTGGACCATGGG
TTAGACTTTCTTACAAAACCATAATATGCATTTCCTAGCAAAATTTATGCTATTAC 
ATTTCCTTATCTCAACAAAGACTGGTAAATTCAGTACTTATTCCTCAATTTTCCTA
CCCTTAAAATGGGGATATTCTGCCTCTCCAAGGAATGCTGGGAACAAGCAAGTCCT 
CATGTTAGGGGTCTTTGAGTTTTCATGGAAGTTTAGGTTATTTATATGATGACATA 
GTTGTCAACTTACTTTCAGGATGGACTTTTCTTTTGTGAGTTTGTGACCTAAATAC
AATAGTTGTTATGCATGTCCAGTTTATGGAAGTACCACTGCAATANCAG 

Sequence ID 1372 SEQ ID NO: 467 

CAGTGCAGCCAAGTATCACACCACTGCACTCCAGTCCTGGACAACAGAAACGANTA 
CTCCATATCAAAAAAATTAAATTAAANGATAATAAATTTCTTGCCGGGCGCAGTGG
CTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACGAAGTCA 
GGAGATTGAGACCATCCTGGCTAATACAGTGAAATCCCCGTCTCTACTATAAATAC 
AAAAAATTAGCTGGGCATGGTGGCGGGCGTCTGTAGTCCCAGCTACTCAGGAGTCT 
GAGGCAGGAGAATGGTGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCGT 
GCCACTGCAATCCAGCCTGGGCAGCAGAACGAGACTCCATCTCAAATAAATAAATA 
AATAAAATGAATTTCAGCTAGAAGAGCCTTATTCCATTTTCCTTTTTATTAAACAT
CTGGCATAAGTTGGTAAGTATGTGAAGTTTATCATATATTCTTATGCGAATTATTA 
TTTTCGCCTTTTTTTTTATAATTCTGTCTGGGATTTGAATAGTAGAGTTTGAATTC 
AGGAAGGACACCTGTGATAGGACAATAAAATCTA

Sequence ID 1374 SEQ ID NO: 468 

GAAAGCACATATGATATACATGTGTGTCATATGTATTATTTTGTTTGCCATCTGAG 
TCTTCAAAATTTGTTACAGAATACCTGCATATTAATATTTCAAGGTATGGATTAAT 

Sequence ID 1378 SEQ ID NO: 469 

CTGAGTATTAACTAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Sequence ID 1380 SEQ ID NO: 470

CCAAACCCAACTGGTCCAGTAGGATACTCACCTTACAGGGGGCGTCTCAAGAGTCT 
CACAGTTCCCTTGGGTCTTAAGAGACTCACTGTTGGACCAGGCGTGGTGACTCACG 
CCTGTAAAACCAGCACTTTGGGAGGCCGAGGCGGGCGGATCAGTTGAGGTCAAGAG 
TTCAAGACCAGCCTGACCAAGGTGCTGAAACCCCGTCTCTACTAAAAATACAAAAA 
TTAGCCAGGCATGGTGGTGTGCGCCTGTAATCCCAGCTACTCCAGAGGCTGAGGCA 
GGAGAATCTCTTGAACCCAGGAGGTGGAGGTTGCAGTGAGTCGAGATCATGCCACT 
GCACTCCAGCCTGGGTGACAGAGCGAGACTCCGTCTTAGAAAAAAAAAAAAAAAAA
AAAAGAACCTCACAGTTCAGCAGGGTTCTAGCATGAGACAATGAGGACAAGGGTAG
GTGAGCAGGTGGAAAGAGTGAGAACAGGTCAATTGTGATGGAGAAAATAATAAAGA 
CAGAAAAGGCAGAAGACTGCCTGGCAGAAGACCTGTCCCAGCAGATACAAAAATAC 
AGACAACAGGAGCCAGCATAGACCCTTGACCTGTGTAAGTCTTTCTCAGGCCTTCT 
TTTAAGTAGAAACATGCCTTTGAAAAAAAGTTTTAATAAACAGGAAAATCATAAAT
CCCTATTTACATAAATAATATATCCTGGTCTTATTCTTAAAACCATTGATTTTTCA 
CGGCTCATTAANAAAGCTGGGCGAGGTGGCTCACGCCCGTCATCCTAGCACTTTGG 
GAGGCCGAGGCGGGCANATCACAAGGTGAGGAGTTGGGAGACCAGCCTGACCAACA 
CGGTGAAACCCAGTCTCTACTAAAAATACAAAAATTANCTGGGGGTGGTGGTGTGT 
GCCTGTAATCCAAGCTACTCGGGAGGCTGAGGCAGGA 



Sequence ID 1382 SEQ ID NO: 471 

CTTACTACCTCCAACATGAAACAAGCAGCCCCGCACTTCTCGAAGGTCTGAGTTAC
TTGGAATCGTTTTACCACATGATGGACAGAAGGAATATTTCAGATATCTCTGAAAA 
CCTCAAGCGTTACCTTCTTCAGTATTTTAAGCCAGTGATTGACAGGCAAAGCTGGA 
GTGACAAGGGCTCAGTCTGGGACAGGATGCTCCGCTCGGCTCTCTTGAAGCTGGCC 
TGTGACCTGAACCATGCTCCTTGCATCCAGAAAGCTGCTGAACTCTTCTCCCAGTG 
GATGGAATCCAGTGGAAAATTAAATATACCAACAGATGTTTTAAAGATTGTGTATT 
CTGTGGGTGCTCAGACAACAGCAGGATGGAATTACCTTTTAGAGCAATATGAACTG
TCAATGTCAAGTGCTGAACAAAACAAAATTCTGTATGCTTTGTCAACGAGCAAGCA
TCAGGAAAAGTTACTGAAGTTAATTGAACTAGGAATGGAAGGAAAGGTTATCAAGA
CACAGAACTTGGCAGCTCTCCTTCATGCGATTGCCAGACGTCCAAAGGGGCAGCAA 
CTAGCATGGGATTTTGTAAGAGAAAATTGGACCCATCTTCTGAAAAAATTTGACTT 
GGGCTCATATGACATAAGGATGATCATCTCTGGCACAACAGCTCACTTTTCTTCCA 
AGGATAAGTTGCAAGAGGTGAAACTATTTTTTGAATCTCTTGAGGCTCAAGGATCA 
CATCTGGATATTTTTCAAACTGTTCTGGAAACGATAACCAAAAATATAAAATGGCT 
GGAGAAGAATCTTCCGACTCTGAGGACTTGGCTAATGGTTAATACTTAAATGGTCA
ATAGAAAAAGTAGGCTGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGA 



Sequence ID 1387 SEQ ID NO: 472 

j ^ ^ ^ ^\ ^ ^ ^ rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp rp 1^1 i^i i^i i^i iji rj^ i^i rji rj^ i^i i^i i^i i^i 

T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T C.Z\.G' T 
GTTAAAGTAGGTTTGTCGACGCGGCCACGAATTTCCCGGGGACCAA 



Sequence ID 1389 SEQ ID NO: 473 

TTTTTTTTTTTTTTTGGGAGTCAGTTTTCTTTTCTTTTCTTTCTTTTTTTTTTTTT
GNTTTTCGGAAACGGAGTCTCGCTTTCTCGCCCACTCTGGAGTGGNGCAGTGGGGN
GGTCTCAGCTCACCACAGCCTCCACCTCCTGGGCCCAAGCGATCCTNTCACCTCAG 
CCTCCTGCGTAGCTGGGACTACAGGCGTGCACCACCATTCCCAGGTAATTTTTGTA 
TTTTTTGTANANACAGGGTTTCACTGTTGTTGCCCAGGCTGGTCTCGAACTCCTGC 
TTCAGTCTGCCANAATGCTGGATTCTAGGCGTGAGCCACCGNGCCTGGCCCAAAAG 
TTACTTTTCTTACAGAAGCAAAGCTTTAATGCATTTTACTGAATGCTTATAGCTTT 
GTAGATACTGAAAAGAGTATGAGCGTCACATACAGACACATNTAACAGCACTGCCT 
CCAACCAGCCCCTACCCACTGGTCAGGNGAGTAANAATCAAAATTCTTTTCTGNGA 
GTGGAACGGAAATTTCATCTCTCCTCCTCAGGCAAGTAGTTAANAGGCTGGNGGGA 
GTCATGGCCCCATTTTGTTCAAAATACAAGCTCCACAGGAACAAAAGGCTGAACTG 
CTCACCTCCCAACTGATGAACCTCGTCTTTGTTCCATGTCAAAGGGGCCTTTGTGT 
TACTGCAGCAGAAACTCCAGCTATCAAACCATCAGGCACCAAAAGTAAAACTCCTT
TCTCTAAAAAGACCTCTCTTTACCTGAGCCTTTCAATGCATCTTTGCCCCCANATA
ATCCTGGATGAGATAATCCCCAGAGGAANACCAGCGCTTGCCTAGTGAAATTATAC 
TATGAGACAAGGGTAAAAGACCTCAAANACCGGGTTGGCAGGTAAGGGAGTAGGGN

Sequence ID 1390 SEQ ID NO: 474 

TCNGTGGCACCCGTTTCCGGCACCTTCAGACTCTGAAGAGCCACCTGCGAATCCAC 
ACAGGAGAGAAACCTTACCATGTACGTAAGCCTCTTGAGGCCGCTCTCTGACCTGC 
GGGGATGTGGAGGGCAGGGAAGGAGGTGGAGCGCAGGGAAGGAGGTGGAGCAGGGA 
GGCAGTGGAACTGTTTGCTCCCATCTCAAGCACACAGTGGGGCAACCACTACGCTA 
ATGGTTGGAAGACCTAGATCTGGGCCCAATGGCCAGACACCCTGCTTGACCTTGGC 
CCAAGCATTAGGGGACTCATCTTTAAAATGAGGGTATGGGACTAGATGATCTGGGC 
CTTAGGAGAGGAGT

Sequence ID 1391 SEQ ID NO: 475 

CGGCTNCTACCCTGCGGAGATCACACTGACCTGGCAGTGGGATGGGGAGGACCAAA 
CTCAGGACACCGAGCTTGTGGAGACCAGGCCAGCAGGAGATGGAACCTTCCAGAAG 
TGGGCAGCTGTGGTGGTGCCTTCTGGAGAAGAGCAGAGATACACGTGCCATGTTCA 
GCACGAGGGGCTGCCGGAGCCCCTCACCCTGAGATGGAAGCCGTCTTCCCAGCCCA 
CCATCCCCATCGTGGGCATCGTTGCTGGCCTGGCTGTCCTGGCTGTCCTAGCTGTC 
CTAGGAGCTATGGTGGCTGTTGTGATGTGTAGGAGGAAGAGCTCAGGTGGAAAAGG 
AGGGAGCTGCTCTCAGGCTGCGTCCAGCAACAGTGCCCAGGGCTCTGATGAGTCTC 
TCATCGCTTGTAAAGCCTGAGACAGCTGCCTGTGTGGGACTGAGATGCAGGATTTC 
TTCACACCTCTCCTTTGTGACTTCAAGAGCCTCTGGCATCTCTTTCTGCAAAGGCA
TCTGAATGTGTCTGCGTTCCTGTTAGCATAATGTGAGGAGGTGGAGAGACAGCCCA
CCCCCGTGTCCACCGTGACCCCTGTCCCCACACTGACCTGTGTTCCCTCCCCGATC 
ATCTTTCCTGTTCCAGAGAAGTGGGCTGGATGTCTCCATCTCTGTCTCAACTTCAT 
GGTGCGCTGAGCTGCAACTTCTTACTTCCCTAATGAAGTTAAGAACCTGAATATAA 
ATTTGTTTTCTCAAATATTTGCTATGAAGGGTTGATGGATTAATTAAATAAGTCAA 
TTCCTGGAAGTTGAGAGAGCAAATAAAGACCTGAGAACCTTCCANAATCCG

Sequence ID 1392 SEQ ID NO: 476 

TGAAACAAAATGAATTTNTATGGGTAAGAGAGGGTAATATTTTAGAGTTGTGTTAC 
AAAACTACAAATTTTTATTAAATTAATAAATCAGAATACTAAATCCATGTGTTTTT 
TTCTTTCTTAAAAAATATCTTTTGGCTGGGCACGGTAGCTCATGGCTGTAATCCCA 
GCACTTTGGGAGGCTGAGGTGGGTGGATCGCCTGATGTCAGGAGTTCAAGACCAGC 
CTGGTCAACATGTTGAAACCCCATCTCTACTAAAAATATAAAAATTAGCCGGTGTG 
GTGGTGGGCGCCTGTAATCCCAGCTACTCAGGAGGCTAAGGCAGGAGAATTGCGTG 
AACCCAGGAGTTCAGTGATGTAGCGGGGAGCTGAGATTGTGCCACTACACTCCAGC 
CTGGATGACAGAGTGAGACTCCATCTCAAAAAAAAAAAAAAAAAA

Sequence ID 1394 SEQ ID NO: 477 

GCATAATGTGAGGAGGTGGAGAGACAGCCCACCCCCGTGTCCACCGTGACCCCTGT 
TCCCATGCTGACTTGTGTTTCCTCCCCAGTCATCTTTCCTGTTCCAGAGAGGTGGG 
GCTGGATGTCTCCATCTCTGTCTCAACTTTATGTGCACTGAGCTGCAACTTCTTAC
TTCCCTACTGAAAATAAGAATCTGAATATAAATTTGTTTTCTCAAATATTTGCTAT 
GAGAGGTTGATGGATTAATTAAATAAGTCAATTCCTGGAATTTGAGAGAGCAAATA
AAGACCTGAGAACCTTCCAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

Sequence ID 1395 SEQ ID NO: 478 

CTTACCATGTCAGTGCACAGAAATGCTGTCTTGGGATGTAGGAAAAATAAATCCAC 
AAAAGCTACCAAGTTTGAAGGGGACCATGAGTCTTCAGGCTGGAGCTTCCAAACCA
GATGAAAACCCCACAATTAACCTGCAGTTTAAGATCCAGCAGCTGGCCATTTCTGG
ACTCAAGGTGAATCGTCTGGATATGTATGGAGAAAAGTACAAACCCTTTAAGGGCA 
TAAAATACATGACCAAAGCTGGGAAGTTCCAAGTTCGAACCTGAAGGGAGCATTTG 
CTGAGGGAATAGTCTTGCACATTTTTTCATTTCTTACTTGTCTAAAAGTAAAAAAA 
AATATCAGCCTGTCTCCTAGGTCAGTCCCCTCCTGGACCCACCCGCTCCCTTTTTT 
CCTTAGCCTTCAGTGCCATGGAACTAATCAAGGGAGGAAAAGGTCACCAGGGAGAA 
CTGGACAGAACTGAAACACAGCAACACCAGTTCTCAAGGACAAGGTGTGTGATGGG
GGTAGGAAGCTTGGTGCTTATGTAACCATTTTAAACGTGGTTTCTATAGGAAAGAC
CAACATTTGTTTAGCTTGCTTGGCTTTAATTATCTAAAGCCAATGAAAGACTTCTT
TGTTGATTTTTTAAGATAGAAAGATT

Sequence ID 1396 SEQ ID NO: 479 

CAAACACTATGTTATTTTATGAANAAGACTTGAACATCTATGGATTTTGGTATTTG
CAAGGGGTGAATGGGGTATTTGCAAGCAGTGAATGAGGAGGCCTGGAACCAATCTT 
CTGCTGATATTGAGGCACAACTGAAAAAGGTATATTACTTAAATCTCTTATTGTAT 
TGTAAACTGTATAAGTAATGAAATTAAAAGGCAGAAATTGTCAGACTGAATAAAAT 
GAAAAGACCAAACAATATGCTGCTTACAAGAAACACAATTCAAATATAAGGACACA 
ATTAGTTTAAAGGAAAAGAACTGGAAAAGATATACCATGATAACACAAGTCAGAAG
AAAGCTGCTGTGGATATATTAATATGAGATGTAGATTTCAGAGCAGTGAATATTGC 
CAGGCATAAAGAAAGTTATTACATAATAATTAAGGTATCAGTTCATCAAGAAGATG
TAATAACCCTAAGTATTTATACAACTAATATCAGAGCTTCAAAATACATGAAGCAA 
AAACCAGTGGAATTGATAGGAGAAACACACAATTACACAATTATAGTCAGAATTTT
CAACATATCTTTCTCAATGGAGAAAACAACTAGACAGGAAATCATTAAGGATATAG 
ATGATTTAAATTATATGATCAACTACCTGGACGTAATTGGCATTTATGGAACACTG 
CACCACCAACAGCAGAGTACATATTATTTTCAAGTACACAGAAAACAGTTACCAAT 
ATAGACCATTTTCTGGGTCATAAAACACATCTCAATAAATGTAAAACAATTAATGT 
TATATAAAGTATGTGCTCTGACCNCAAAGGAATTAGAGATCAATAAAAGAACATCT 
TTGAAAAATCTCACNTATTTAAAAACTAATAACTCACTTCTAAATAACTCCTGTNT
CAAGAGAATNAAANGG

Sequence ID 1397 SEQ ID NO: 480 

CCCAGCCTCACTGCGCCCCGTCAGGCCAGGCAGCTGCCCTCAGGGTCTGCCAAGGT 
GGGGGTCAAGGGCCATGGGGGCAGGTAGCTCTGCCTGCAAAGCCCACAAGCATGTC 
AGATCACCTGGGCTGCAGACAGACAAACACCTGAGCTGTTCTGAATACCTTCAGGT
TCCTGGCCTCGCTGAGCAAGTGCAGAAATTTTTACCTTCAAGGATCAGGGTTTTTC
TGTTTGTTTGTTTTTTAACACACACATATGTGAACAAAGAGTATGCGTTTGTACTG 
GCAGAAGAAGCGTCTGGTAAGACAACCAGCAAGTTAACAATGGTCACCTCCAGAAA
TGGGCTGGGTAAACCAAAGAATTTTTTTGTTTTTGTTTTTTTTGAGTCAGGGTCTA 
GCTCTGTCACCCAGGCTGGAACGCACTGGTGTGATCACGGCTCACTGCAGCCTTGA 
CCTCCCTGGCTCAAGCAATCCTCCCAGCTCAGCCTCCTGAGTCGTTGGGACTACAG 
GCACGTGCCACCACGCCTGACACATTTTTTAAATTTTTGTAGAGACAGTGTTTCAC 
CATGTTGCCCAGGCAGGTCTCAAACTCCTGGGCTCAAGTGGTCCTCCAGCTTCAGC
CTCCCAAAGTGCTAGGATTATAGGTGTGAGCCACAGTGCCCAGCCCCGTAGTGGAG
AATTTCTGTTGAATGAACCAAAAGCAACTGCCAACCTCTCCATGCACCATGTGTTT
CAGAGGAGAAAGCACAGTGAAGAATGCAGTGTGTTCTGAGGTCCTGTCACCCCTGA 
GGCTGTGTGTGTCCTTTGCCAAATTAAAGAGTCTTACTGAATGCGGTGCATCCAGG 
AGACAGGCCNAGGTTTGGACTGGTAAAAAAAAA

Sequence ID 1399 SEQ ID NO: 481 

CAGACACCTGGNAGAACGGGAAGGAGACGCTGCAGCGCGCGGACCCCCCAAAGACA 
CATGTGACCCACCACCCCATCTNTGACCATGAGGCCACCCTGAGGTGCTGGGCCCT 
GGGCTTCTACCCTGCGGAGATCACACTGACCTGGCAGCGGGATGGCGAGGACCAAA 
CTCAGGACACCGAGCTTGTGGAGACCAGACCAGCAGGAGACAGAACCTTCCAGAAG 
TGGGCAGCTGTGGTGGTGCCTTCTGGAGAAGAGCAGAGATACACATGCCATGTACA 
GCATGAGGGGCTGCCGAAGCCCCTCACCCTGAGATGGGAGCCATCTTCCCAGTCCA 
CCGTCCCCATCGTGGGCATTGTTGCTGGCCTGGCTGTCCTAGCAGTTGTGGTCATC 
GGAGCTGTGGTCGCTGCTGTGATGTGTAGGAGGAAGAGTTCAGGTGGAAAAGGAGG 
GAGCTACTCTCAGGCTGCGTCCAGCGACAGTGCCCAGGGCTCTGATGTGTCTCTCA 
CAGCTTGAAAAGCCTGAGACAGCTGTNTTGTGAGGGACTGAGATGCAGGATTTCTT 
CACGCCTCCCCTTTGTGACTTCAAGAGCCTCTGGCATCTCTTTCTGCAAAGGCACC 
TGAATGTGTCTGCGTCCTTGTTAGCATAATGTGAGGAGGTGGAGAGACAGCCCACC 
CTTGTGTCAACTGTGACCCCCTGTTCCCATGCTGACCTGTGTTTCCTCCCCAGTCA 
TCTTTTTTGTTCNCAATAGGTGGGGCCTGGATGTCTCCATCTCTGTNTCA 



Sequence ID 1440 SEQ ID NO: 482 

TTATAAGGTACTTTTAAGGTATTTTAGTTGTCTTAGTCTATATTTCTGTACTCACC 
TTTCTTTATCCACTCATCAGTTGATGGGCATGTAGGTTGGTTCCATATCTTTGCAA 
TTCTGAATTGTGCTGTGATCAGGTGTCTTTTTAGTATAATGATTTACTCTCCTTTG 
GGTAGATACCCAGTAGTGGGATTGCTGGATCGAATGGTTTTTATAATTTTCTATTT 
TACCACAGTTTCTCTCTGCATTTTTCCTCTTTGACCACTAACCATGTGAAATTCTC
ATATTGACCTTTATAATGATCATGAACTCTTAGTATCATTGGGAAGGCCACATTTG 
CCACTTATGATTGTAAACCTTATCCTCCATTTTTCCTGTTATTGTTGGTGCAAAAA
GCACCTATTATACCAGGACTTTAAAAATCAGTCTGATAAGTCTTTGATAAGTCTAA 
TAATAATAACTGATAAGTCCATTGAATTTGCTTCTGATTACTTTTTCTTTAGTAGC 
TAAACATGTATGTACTCCTATGATTACAATGAACACTCCTCTCCATTTAAATTAAT 
TATTTACATTGATGAAATAGCAAAATGTTAATGACTAAATACTGTCTTGGTTTTTT 
CGTTCCAGGTCAGTCAATATTAACTTCTTATAATTTTCTTTTTTTTCTTT

Sequence ID 1447 SEQ ID NO: 483

GCAAGGACTAACCCCTATACCTTCTGCATAATGAATTAACTAGAAATAACTTTGCA 
AGGAGAGCCAAAGCTAAGACCCCCGAAACCAGACGAGCTACCTAAGAACAGCTAAA 
AGAGCACACCCGTCTATGTAGCAAAATAGTGGGAAGATTTATAGGTAGAGGCGACA 
AACCTACCGAGCCTGGTGATAGCTGGTTGTCCAAGATAGAATCTTAGTTCAACTTT 
AAATTTGCCCACAGAACCCTCTAAATCCCCTTGTAAATTTAACTGTTAGTCCAAAG 
AGGAACAGCTCTTTGGACACTAGGAAAAAACCTTGTAGAGAGAGTAAAAAATTTAA 
CACCCATAGTAGGCCTAAAAGCAGCCACCAATTAAGAAAGCGTTCAAGCTCAACAC 
CCACTACCTAAAAAATCCCAAACATATAACTGAACTCCTCACACCCAATTGGACCA 
ATCTATCACCCTATAGAAGAACTAATGTTAGTATAAGTAACATGAAAACATTCTCC 
TCCGCATAAGCCTGCGTCAGATTAAAACACTGAACTGACAATTAACAGCCCAATAT 
CTACAATCAACCAACAAGTCATTATTACCCTCACTGTCAACCCAACACAGGCATGC 
TCATAAGGAAAGGT 

Sequence ID 1448 SEQ ID NO: 484 

GGCCACCGGGTGCAAGGTCAGGGCTGGGGTGGAGGCTGGGAAGCCCAGGGCTTGGC 
CCACTGTGGCCGCCTTGTGTGGTCACTGCTTTCCTGGGCCTGCTGTGAGCTCCCTC 
TAGGACCCCAGGCCTGTCTGGTGGGTCACTGTGACCACCACCTTGCACAGCACCTG 
GCGCGTGGCAGGTGCTCAAACATTACTTGTTTCGGAATGAACTTCATCTTGCTCTT 
GGCTTTTTGACTAATGCTGTGGAACATCTGACTAATTAGTGACTCTTTGGGGCCCC 
CAGTTTCCCAGCTATAAAGTGGTAATATTAAGATAATAATTCGGCCGGGCGCGGTG 
GCTCACGCCTGTAATCCCAGCAGCACTTTGGGAGGCCGAGGTGGGCAGATCACGAG 
GTCAGAAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCATCTCTACTAAAAA 
TACAAAAAATTANCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCANGAG 
GCTGANGCAGGAGAATGGTGTGAACCCGGGAGGCAGAGGTTGCAGTGAACCAAGAT 
CGNNCCACTGCACTCCAGCCTGGGCAACAGAGCGAGACTCCATCTTAAAAAA 

Sequence ID 1449 SEQ ID NO: 485 

AATCAGGGCCGCAGTGTGTTCTGCGCCTGCCCAGAGCTGACTCCTGATTTAACCGC 
TGGCGTAACCGCGGGTTGCACGCATGCGTGCTGAAAAGCCTTTCACCCTCACGTGG 
TTTCTTTTTTAACCAGTCATCAAGCGAGGCTCGCGCGCAGGCCCCGCGTTGGAAAA 
TGGCGGGGAAGCTGAAACCTCTGAATGTGGAGGCGCCAGAAGCTGCTGAGGAGGCT 
GAAGGTAGTGAGGGCAAGTGGGCTGCACTCCTTTCTCTCCAACCAGGGCAGAAAGG 
AGGGAGGATTCGTCCCATTACAATAATGAAATAATGATATTCTAATTTTTTTAAAT 
AAAATGTTAAGCCTTTTGTTATTGAA

Sequence ID 1450 SEQ ID NO: 486

GGAAANCATGAGGCTTCGGGAGCCGCTCCTGAGCGGCAGCGCCGCGATGCCAGGCG 
CGTCCCTACAGCGGGCCTGCCGCCTGCTCGTGGCCGTCTGCGCTCTGCACCTTGGC 
GTCACCCTCGTTTACTACCTGGCTGGCCGCGACCTGAGCCGCCTGCCCCAACTGGT 
CGGAGTCTCCACACCGCTGCAGGGCGGCTCGAACAGTGCCGCCGCCATCGGGCAGT 
CCTCCGGGGAGCTCCGGACCGGAGGGGCCCGGCCGCCGCCTCCTNTAGGCGCCTCC 
TCCCAGCCGCGCCCGGGTGGCGACTCCAGCCCAGTCGTGGATTCTGGCCCTGGCCC 
CGCTAGCAACTTGACCTCGGTCCCAGTGCCCCACACCACCGCACTGTCGCTGCCCG 
CCTGCCCTGAGGAGTCCCCGCTGCTTGGTAAGGACTCGGGTCGGCGCCAGTCGGAG 
GATTGGGACCCCCCCGGATTTCCCCGACAGGGTCCCCCANACATTCCCTCAGGCTG 
GCTCTTCTACGACAGCCAGCCTCCCTCTTCTGGATCAGAGTTTTAAATCCCANACA 
GAGGCTTGGGACTGGATGGGAGAGAAGGTTTGCGAGGTGGGTCCCTGGGGAGTCCT 
GTTGGAGGCGTGGGGCCGGGACCGCACAGGGAAGTCCCGAGGCCCCTCTAGCCCCA 
AAACCANAGAAGGCCTTGGAGACTTCCCTGCTGTGGCCCGAGGCTNAGGAAGTTTT 
GGAGTTTTGGGTCTGCTTANGGCTTCNAGCAGCCTTGCACTGAGAACTTTGGTAGG 
GACCTCGAGTAATCCACTCCNTTTTNGGGACTGACGTGAGGCTCCCGGTGGGGAAA 
GANACTGACCTNTC 

Sequence ID 1453 SEQ ID NO: 487 

CCGACCTGTCTCGCTCCGTGGCCTTAGCTGTGCTCGCGCTACTCTCTCTTTCTGGC 
CTGGAGGCTATCCAGCGTACTCCAAAGATTCAGGTTTACTCACGTCATCCAGCAGA 
GAATGGAAAGTCAAATTTCCTGAATTGCTATGTGTCTGGGTTTCATCCATCCGACA 
TTGAAGTTGACTTACTGAAGAATGGAGAGAGAATTGAAAAAGTGGAGCATTCAGAC
TTGTCTTTCAGCAAGGACTGGTCTTTCTATCTCTTGTACTACACTGAATTCACCCC 
CACTGAAAAAGATGAGTATGCCTGCCGTGTGAACCATGTGACTTTGTCACAGCCCA 
AGATAGTTAAGTGGGATCGAGACATGTAAGCAGCATCATGGAGGTTTGAAGATGCC 
GCATTTGGATTGGATGAATTCCAAATTCTGCTTGCTTGCTTTTTAATATTGATATG 
CTTATACACTTACACTTTATGCACAAAATGTAGGGTTATAATAATGTTAACATGGA
CATGATCTTCTTTATAATTCTACTTTGAGTGCTGTCTCCATGTTTGATGTATCTGA 
GCAGGTTGCTCCACAGGTAGCTCTAGGAGGGCTGGCAACTTAGAGGTGGGGAGCAG 
AGAATTCTCTTATCCAACATCAACATCTTGGTCAGATTTGAACTCTTCAATCTCTT 
GCACTCAAAGCTTGTTAAGATAGTTAAGCGTGCATAAGTTAACTTCCAATTTACAT 
ACTCTGCTTAGAATTTGGGGGAAAATTTAGAAATATAATTGACAGGATTATTGGAA 
ATTTGTTATAATGAATGAAACATTTTTGTCATATAAGATTCATATTTACTTCTTAT 
ACA 



Sequence ID 1454 SEQ ID NO: 488 

TAAATAGGGAATCCTTTCCCCATTGCTTGTTTTTCTCAGGTTTGTCAAAGATCAGA 
TAGTTGTAGATATGCGACGTTATTTCTGAGGGCTCTGTTCTGTTCCATTGATCTAT 
ATCTCTGTCACATGCACACGTATGTTTGTTGTGGCACTATTCACAGTGGCAAAGAC 
TTGGAACCAACCCAAATGTCCAACAATGATAGACCGGGTTAAGAAAATGCGGCACA 
TATACACCATGGAATACTATGTAGCCATAAAAAATGATGAGTTCGTGTCCTTTGTA 
GGGACATGGATGAAATTGGAAATCATCATTCTCAGTAAACTATCGCAGGAACAAAA 
AACCAAACACTGCATATTCTCACTCATAGGTGGGAATTGAACAGTGGGAACACATG 
GACACAGGAAGGGGAACATCACACTCTGAGGACTGTTGTGGGGTGGGGGGAGGGAG 
GAGGGATAGCATTGGGAGATATACCTAGTGCTGGATGACGAGTTAGTGGGTGCAGC 
GCACCAGCATGTCACATGTATACATATGTAACTAACCTGCACATTGTGCACATGTA 
CCCTAAAACTTAAGGTAT

Sequence ID 1456 SEQ ID NO: 489 

CCGCAACAAACACGGGAGTGCAGATATCGCTGCGATGGGCTGATTTCCTTTATTTG 
GGTATATACCCAGCAGTGGGATTGCTGGATTGTATGGTAGCTCTATTAGTTTTTTG 
AGGAACCTCCAAACTGTTCTNCATAGTGGTTGTACTCATTTACATTCCCACTGTGA 
ACCCTGAAAATTTGAGGCAGGTCTCAGTTAAATTAGAAAGTTGATTTTGCCAAGTT
GGGGACACGCACTCGTGACACAGCCTCAGGAGGAACTGATGACATGTGCCCAGGTG 
GTCAGAGCACAGCTTGGTTTTATACATTTTAGGGAAACCTGAGCCATCAATCAACA 
TACGTAAAATGGGCCGGGCACAGCAGCTCAAGCTGTAATCCCAGCACTCTGGGAGG 
CCGAGGCGGGTGGATCACTTGAGGTCAGGAGTTCGAGACCAGCCTGGCCAACATGG 
TGAAACCCCGTCTCTATTAAAAATACAAAGCTTAGCTGGATGTGGTGGCGCATGCC 
TGTAGTCCCAGCTGCTCTAGGAGGCTGAGGCATGAGAATTGCTTGAACCTGGGAGG 
CAGAGGCTGCAGTGAGCCGAGATCGAGCCACTATACTCCAGCCTGGTCAACAGAGT 
GAGACCCTGTCT 

Sequence ID 1460 SEQ ID NO: 490 

CCACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCACCTGACTCCTG 
AGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGT 
GGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGA 
GTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGG 
CTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAAC 
CTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGA 
TCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACT
TTGGCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGT
GTGGCTAATGCCCTGGCCCACAAGTATCACTAAGCTCGCTTTCTTGCTGTCCAATT 
TCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACTGGGGGATATTATG
AAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTG 

Sequence ID 1490 SEQ ID NO: 491 

ATGGGCATCTCTCGGGACAACTGGCACAAGCGCCGCAAAACCGGGGGCAAGAGAAA 
GCCCTACCACAAGAAGCGGAAGTATGAGTTGGGGCGCCCAGCTGCCAACACCAAGA 
TTGGCCCCCGCCGCATCCACACAGTCCGTGTGCGGGGAGGTAACAAGAAATACCGT 
GCCCTGAGGTTGGACGTGGGGAATTTCTCCTGGGGCTCANAGTGTTGTACTCGTAA 
AACAAGGATCATCGATGTTGTCTACAATGCATCTAATAACGAGCTGGTTCGTACCA 
AGACCCTGGTGAAGAATTGCATCGTGCTCATCGACAGCACACCGTACCGACAGTGG 
TACGAGTCCCACTATGCGCTGCCCCTGGGCCGCAAGAAGGGAGCCAAGCTGACTCC 
TGAGGAAGAAGAGATTTTAAACAAAAAACGATCTAAAAAAATTCAGAAGAAATATG
ATGAAAGGAAAAAGAATGCCAAAATCAGCAGTCTCCTGGAGGAGCAGTTCCAGCAG 
GGCAAGCTTCTTGCGTGCATCGCTTCAAGGCCGGGACAGTGTGGCCGAGCAGATGG 
CTATGTGCTAGAGGGCAAAGAGTTGGAGTTCTATCTTAGGAAAATCAAGGCCCGCA 
AAGGCAAATAAATCCTTGTTTTGTCTTCACCCATGTAATAAAGGTGTTTATTGTTT 
TTGTT 

Sequence ID 1491 SEQ ID NO: 492 

CTTNCACATACTGATTGATGTCTCATGTCTCTCTAAAATGTGTAAAACCAAGCTGT 
GCCCCAACCACCTTGGGNACATGTGGNGAGGACCTCCTGAGGCTGTGTCATGGGCA 
CACCTTAACCCTGGGAAAATAAACTTTCTAAACTGACTTGAGAGCTGTCTCAGATA 
TTCTGAGCTTACAGTTATTGTGAAATCATTTTAATTATAAATTAAGTGGAGATTTA 
CTTAAAATCATGTGTAGAAGTAGCCTGTGATATAGTCCTAGATACATACATTATCA 
TCTTATGTATCTTCCCTCCCTCTTCCAGGTTCTGATAAAAACAGATGAAATCTGAA
AGACCATGACAGTAGTATTTTGAAAATGACAGTATTTGAAATTAAAAAATTGTAAA 
AGTGTTCTGTTCTATCACTGCCAAAGGATAAGTTACAAATTGGTTCTTGGAACGTA 
ATATGTACTATGTGCTTGCTATTTAATAATTTACCAGTCTTAGTCTTTTTTATTCA 
GACTAATTTTACCTTTTTTTAACCTATGACTCTTTAGTTATAGTAGTACAAAAAAG 
TAGTTTTAGTTATAGTTTTAGTTGTAGTACAAAAAAGCATTTTCTGTAAGCTTAAT 
TTCTTTCCCCTTCCCGCTTTCCCAGTCAGATGACTTTAGTGATTTGGAGTTGTGTG 
CTTTATAAGTGCATTCCTCAGAGGACTTAATATTACTAAGATTTTAGCAACNCTGA 
AATATGTT

Sequence ID 1492 SEQ ID NO: 493

TGTNCCTGTAGTCCTGTGTGGGAGGATTGCCTGAGCCTAGGAGCTCAAAGTTGCAG 
TGAGCCCAGATCGNGNCATTGCAGTCCAGCCTGGGTGACAGAGTGAGACCCCATGT 
CAAAAAAAAAAAAACAAAAAACAGGGGCCTGCCTCANCCAGCAGGTGAGGTCTGCC
ACTGAGAGCACTTCTAGCAGCAGGAACAGCCTCCACCCCCACACTGCAATCAAGTT 
TTTTGGGTCAGCCTTAGGAGCTAANAAAGGGCCTAGTTTGNCTAAATAGCAGGAGT 
TATATCCAGGGATCTTCAGGCCCAGGAATGCTAATGAGTAGGCATTCCATGGGCCC 
TGGGAATGGCTTTGTGTGCCANAAATGATGGCCACAAAGGCCTTGCTGCCTTTTTT 
CAAAATGGCTGCATCCAGCTGAGTGCTCTCTGCCAAAGGGGANAANAAAATAAGTC 
TCCAGTGCATTTAGATTGGTCTCTCATCATCTCTCTCCTTTTTGTTTTTATTAGTC 
TCCTTAACCAAAACTGCCAAGAAAGGCTTGGAATTGAAACAAAACCTGATANAANA
GGTAAGAGGTTGTTCTTTT 

Sequence ID 1493 SEQ ID NO: 494 

TGTNTCAAAAAAAAAAAAAAGAACGGNAATGTACTGGAGATGTATTTGATAACCAA
GGNTTTAGGTAAATTTTCACCAGTATTAGTTNTATTTGCAAACTGAAAAATGTTGT 
AGGCTTAATATAAAATAACCACATTAGTGAACATTATATCTCTTAGAAGAAAGGCC
ATATTTTGCTCCTGCTTCTGTAAAAATATTATTTGTTTGAAGGGGAAATAATGGTA 
GTGTGACCTTTCACTTAATTCCTACTCCCTTAATGTGAGAGAGACAAAATGAGCTG
AAGAAGGAAAATTCTGGAGTTACACTCCACAACCTTGAACATACTGACGGACATCT
CTGTTTTGACAACGATTTCTCCATGCCACCCATGCTNTAATGCCTTGTGGATCACG 
GACAACCCTCTTTGCACAAGCTACAGCATCAGCGATGTTATCTTGCAGCAAAGCAC 
TGCAGGATAAATGACAGGCATTAACTGCTCCTGGGGTTTTGCCATCATTACACCAG 
TAGCGGCTATTGATCTGAAATATCCCATAATCAGTGCTTCTGTCTCCAGCATTGTA 
GTTTGTAGCTCGTGTGTTGTAACCACTCTCCCATTTGGCCAAACACATCCAGTTTG 
CTAGGCTGATTCCCCTGTAGCCATCCATTCCCAATCTTTTCAGAGTTCTGGCCAAC 
TCACACCTTTCAAAGACCTTGCCCTGGACCGTAACAGAAAGGAGGACAAGCCCCAG 
AACAATGAGAGCCTTCATGTTGAC

Sequence ID 1494 SEQ ID NO: 495 

TTGGTACCCGGGAAATTCTTTGCCGCGTCGACGGCCGGTGAGGCAGATCACCTGAG 
CCCAGGAGTTCAGGACCAGCCTGGGCAGCATACCGGGATTCCATCTNNACTAAAAA 
CAGTAGGCTGGGTGTGGTGGCTCATGTCTGTAAGCTCAGGACTTTGGAAGGCCAAG
ATGGGAGGATCACTTGAGCCTGGGAGTTTGACACCAGCTTGAGCATCGTAGCCAGG 
CCCTGACTCTACAAAAAAGTGAAATAATTAGCCGAGTGTGGTGGTTCACACCTGTA 
ATCCCAGCTGCTCAGGAGGCTGAGGTAGGAGAATCATTTGAACCCGGGAGGTGGAG
GTTGCAGTTAGCCGAGATCACGCCATTGCACTCCGGCCTGGGCGATAAAGCGAGAC
TCTGTCTCAAAAAAAAAAAAAA

Sequence ID 1495 SEQ ID NO: 496 

ATTCGGGCCGAGATGTCTCGCTCCGTGGCCTTAGCTGTGCTCGCGCTACTCTCTCT 
TTCTGGCCTGGAGGCTATCCAGCGTACTCCAAAGATTCAGGTTTACTCACGTCATC 
CAGCAGAGAATGGAAAGTCAAATTTCCTGAATTGCTATGTGTCTGGGTTTCATCCA 
TCCGACATTGAAGTTGACTTACTGAAGAATGGAGAGAGAATTGAAAAAGTGGAGCA
TTCAGACTTGTCTTTCAGCAAGGACTGGTCTTTCTATCTCTTGTACTACACTGAAT 
TCACCCCCACTGAAAAAGATGAGTATGCCTGCCGTGTGAACCATGTGACTTTGTCA 
CAGCCCAAGATAGTTAAGTGGGATCGAGACATGTAAGCAGCATCATGGAGGTTTGA 
AGATGCCGCATTTGGATTGGATGAATTCCAAATTCTGCTTGCTTGCTTTTTAATAT 
TGATATGCTTATACACTTACACTTTATGCACAAAATGTAGGGTTATAATAATGTTA 
ACATGGACATGATCTTCTTTATAATTCTACTTTGAGTGCTGTCTCCATGTTTGATG 
TATCTGAGCAGGTTGCTCCACAGGTAGCTCTAGGAGGGCTGGCACCTTAGAGGTGG 
GGAGCAGAGAATTCTCTTATCCAACATCAACATCTTGGTCAGATTTGAACTCTT

Sequence ID G6 SEQ ID NO: 497 

GGATTTTTGGTCCGCACGCTCCTGCTCCTGACTCACCGCTGTTCGCTCTCGCCGAG 
GAACAAGTCGGTCAGGAAGCCCGCGCGCAACAGCCATGGCTTTTAAGGATACCGGA 
AAAACACCCGTGGAGCCGGAGGTGGCAATTCACCGAATTCGAATCACCCTAACAAG 
CCGCAACGTAAAATCCTTGGAAAAGGTGTGTGCTGACTTGATAAGAGGCGCAAAAG 
AAAAGAATCTCAAAGTGAAAGGACCAGTTCGAATGCCTACCAAGACTTTGAGAATC 
ACTACAAGAAAAACTCCTTGTGGTGAAGGTTCTAAGACGTGGGATCGTTTCCAGAT 
GAGAATTCACAAGCGACTCATTGACTTGCACAGTCCTTCTGAGATTGTTAAGCAGA
TTACTTCCATCAGTATTGAGCCAGGAGTTGAGGTGGAAGTCACCATTGCAGATGCT 
TAAGTCAACTATTTTAATAAATTGATGACCAGTTGTTAAAA

Sequence ID &± SEQ ID NO: 498 nt:362

CTTATTGAAAATTTTACTAATTTCTTACTTTTTAGGTTTTAGGAGAATACTTTTGGA 
TAATTGACTAGCCTCACATTATATTGATAGAGGTTCTTGAAAACTTTAATGCCAAT
TCATGTATCTTATGACTAAAATAGATAATCCATTTAGAAATTTAAGTCATTCTTGC
GTGCTTGATATGTGTCAGCACTATCCAAGTTGCTAGGGGATACAATGGTGAAGTG 
AAAATATCAGCTAGGTGCCGGTGGCTCACACCTGTTATCCCAACAGTTTGGGAGG 
CCAGGGTGGGAGGATCACTCAAGCACANGCGTTTCACACCAGCCTGGACAACAT 
ACAAGACCCCATCTTTACCAAAAGTTAAG

Sequence ID 4-»9- SEQ ID NO: 500 nt:382

TTTTCTTAGAACTTTATTTTTTCTGGCCAGGCGCAGTGGCTCACACCTGTAATCCC
AGCACTTTGGGAGGCCAAGGCAGGTCGATCACCTGAGGTCAGGAGCTCAAGACC
AGCCTGGCCAACATGGTGAAACCCTGTCTCTACTAAAAATACAAAAATTAGCTGG
GCGTGGTGGCGCATGCCTGTAATCCCANCTACTCAGGAGGCTGAGGCAGGAGAA
TTGTTTGAACCCGGGAGGCGGAGGTTGCANTGAGCCGAGATTGCGCCACTGCACT
CCAGCCTGGGCAACAGAGCGAAACTCCATCTCAAAAAAAAAAAAAAAAAACAAC
CTTTATTTTTTCTGATTTTAAAAGTAATAACTAGTTTGTAGAAACATTAAAAGT

Sequence ID 8-^2- SEQ ID NO: 501 nt:559

TCTTTCGGAAGCGCGCCTTGTGTTGGTACCCGGGAATTCGCGGCCGCGTCGACGC 
GGTCGTAAGGGCTGAGGATTTTTGGTCCGCACGCTCCTGCTCCTGACTCACCGCT 
GTTCGCTCTCGCCGAGGAACAAGTCGGTCAGGAAGCCCGCGCGCAACAGCCATG
GCTTTTAAGGATACCGGAAAAACACCCGTGGAGCCGGAGGTGGCAATTCACCGA 
ATTCGAATCACCCTAACAAGCCGCAACGTAAAATCCTTGGAAAAGGTGTGTGCTG
ACTTGATAAGAGGCGCAAAAGAAAAGAATCTCAAAGTGAAAGGACCAGTTCGAA 
TGCCTACCAAGACTTTGAGAATCACTACAAGAAAAACTCCTTGTGGTGAAGGTTC
TAAGACGTGGGATCGTTTCCAGATGAGAATTCACAAGCGACTCATTGACTTGCAC
AGTCCTTCTGAGATTGTTAAGCAGATTACTTCCATCAGTATTGAGCCAGGAGTTG
AGGTGGAAGTCACCATTGCAGATGCTTAAGTCAACTATTTTAATAAATTGATGAC
CAGTTGTTT 

Sequence ID 7^7- SEQ ID NO: 499 nt:464

GCGGCTGCTGTTGGTTGGGGGCCGTCCCGCTCCTAAGGCAGGAAGATGGTGGCCG 
CAAAGAAGACGAAAAAGTCGCTGGAGTCGATCAACTCTAGGCTCCAACTCGTTAT 
GAAAAGTGGGAAGTACGTCCTGGGGTACAAGCAGACTCTGAAGATGATCAGACA
AGGCAAAGCGAAATTGGTCATTCTCGCTAACAACTGCCCAGCTTTGAGGAAATCT
GAAATAGAGTACTATGCTATGTTGGCTAAAACTGGTGTCCATCACTACAGTGGCA
ATAATATTGAACTGGGCACAGCAGCATGCGGAAAATACTACAGAGTGTGCACACTGG 
CTATCATTGATCCAGGTGACTCTGACATCATTAGAAGCATGCCAGAACAGACTGG
TGAAAAGTAQAACCTTTTCACCTACAAAATTTCACCTGCAAACCTTAAACCTGCAA
AATTTTCCTTTAATAAAATTTGCTTG

Product and Method



The present invention relates to oligonucleotide 
probes, for use in assessing gene transcript levels in a 
cell, which may be used in analytical techniques, 
particularly diagnostic techniques and kits containing 
the same. 



